Abstract


The Java language provides a promising so- 
lution to the design of safe programs, with 
an application spectrum ranging from Web 
services to operating system components. 
The well-known tradeoff of Java’s portabil- 
ity is the inefficiency of its basic execution 
model, which relies on the interpretation of 
an object-based virtual machine. Many so- 
lutions have been proposed to overcome this 
problem, such as just-in-time (JIT) and off- 
line bytecode compilers. However, most com- 
pilers trade efficiency for either portability or 
the ability to dynamically load bytecode. 

In this paper, we present an approach 
which reconciles portability and efficiency, 
and preserves the ability to dynamically load 
bytecode. We have designed and implemented
an efficient environment for the execution
of Java programs, named Harissa¹.
Harissa permits the mixing of compiled and 
interpreted methods. Harissa’s compiler 
translates Java bytecode to C, incorporat- 
ing aggressive optimizations such as virtual- 
method call optimization based on the Class 


¹This research was supported in part by the
Brittany Council.


Hierarchy Analysis. To evaluate the perfor- 
mance of Harissa, we have conducted an ex- 
tensive experimental study aimed at compar- 
ing the various existing alternatives to exe- 
cute Java programs. The C code produced 
by Harissa’s compiler is more efficient than 
all other alternative ways of executing Java 
programs (that were available to us): it is up 
to 140 times faster than the JDK interpreter, 
up to 13 times faster than the Softway Guava 
JIT, and 30% faster than the Toba bytecode 
to C compiler. 

Keywords: Java, C, Bytecode, Off-line com- 
pilers, JIT compilers 


1 Introduction 


The Java language [1, 2] provides a promising 
solution to the design of safe programs, with 
an application spectrum ranging from Web 
services to operating system components [3]. 
The success of Java is partly due to the fact 
that its basic execution model relies on the 
interpretation of an object-based virtual ma- 
chine which is highly portable. However, the 
well-known tradeoff of Java’s portability is 
the inefficiency of interpretation. Several so- 
lutions have been proposed to overcome this 


Conference on Object-Oriented Technologies and Systems - June 16-20, 1997 


problem, such as just-in-time (JIT) [4, 5, 6, 7] 
and off-line [8, 9] bytecode compilers. 

Just-in-time systems compile code to na- 
tive form at runtime on demand. This ap- 
proach avoids the overhead of compiling un- 
used code, and eliminates the gap between 
compile time and execution time. Compiling 
during program execution, however, inhibits 
aggressive optimizations because compilation 
must only incur a small overhead. This is 
particularly important in the case of modern 
RISC processors for which complex analyses 
are required to achieve the best result. More- 
over, the quality of the generated code crit- 
ically relies on knowledge about the specific 
features of the target processor. Therefore, 
such compilers are not platform independent
and require a large amount of work to be
ported.

Off-line compilers do not impose critical
bounds on compilation time; optimizing anal- 
yses can be run as needed. They can also 
be platform independent, if they generate as 
output an intermediate language. However, 
in the context of Java, many applications
dynamically load classes (i.e., bytecode) at
run time, which limits the applicability of
pure off-line compilers.

In this paper, we present an approach that 
reconciles portability and efficiency, and pre- 
serves the ability to dynamically load byte- 
code. We have designed and implemented 
an efficient environment for the execution of
Java programs, named Harissa². Harissa provides
a bytecode compiler and an interpreter
integrated in the runtime library. Thus, a 
compiled program is still able to dynam- 
ically load classes and to interpret them. 
Harissa’s compiler translates Java bytecode 
to C and furthermore incorporates aggressive 
optimizations. 

To evaluate Harissa, we have conducted an
extensive experimental study aimed at comparing
the various existing alternatives to execute
Java programs. The contributions of
our work are as follows.

²Harissa was previously named Salsa.


• The C code produced by Harissa’s compiler
is more efficient than all other alternative
ways of executing Java programs
(that were available to us): on the Caffeine
Micro-benchmarks [10], it is 5 to
140 times faster than the JDK 1.0.2 interpreter,
2 to 13 times faster than the Softway
Guava JIT [6], and on average 20%
faster than the Microsoft JIT compiler.
On real application benchmarks, such as
the Javac compiler, it is 5 times faster
than the JDK interpreter and 30% faster
than the Toba [9] bytecode-to-C compiler.

• The compiler statically evaluates the
stack by abstractly interpreting the bytecode
and replaces stack management
with variables. This optimization eliminates
one of the main sources of inefficiency
in Java.

• The compilation process performs
virtual-method call optimization based on
class hierarchy analysis (CHA) [11, 12].
On the set of programs used in our
benchmarks, this analysis permits the
replacement of up to 40% of virtual method
calls by simple procedure calls.

• In contrast to existing off-line compilers,
the runtime system of Harissa includes
an interpreter that preserves the ability
of an application to dynamically load
bytecode.

• Finally, we discuss the benefits and
limitations of off-line compilation vs JIT
compilation. Based on our experimental
study, we show that, for frequently
used programs, it is always more advantageous
to use off-line compilation rather
than JIT compilation.


USENIX Association

The paper is organized as follows: Section
2 describes existing approaches for optimiz- 
ing the execution of Java programs. Sec- 
tion 3 presents Harissa. Section 4 presents 
related work in class hierarchy analysis and 
existing bytecode compilers. Section 5 ana- 
lyzes the performance of the code generated 
by Harissa’s compiler on micro-benchmarks 
and real benchmarks, such as the Javac com- 
piler and the Javadoc documentation genera- 
tor. Section 6 concludes by describing future 
work and comparing JIT and off-line compil- 
ers. 


2 How to Improve Java 
Execution 


Several strategies have been presented to op- 
timize execution of Java programs. They 
range from aggressive compilation schemes 
to specific hardware processors. Advantages 
and drawbacks of these schemes are the fol- 
lowing: 


• Native Java compilers - Compiling
source code into native code is the most
common way of compiling a language.
But this approach is contrary to the
Java philosophy since all the advantages
of having a platform-independent
language disappear. For instance, the
source code of Java programs is often not
available. However, this strategy may be
useful to obtain very efficient target binaries
for very specific environments. This
approach is implemented in the Vortex
project [12].

• Bytecode compilers - Unlike native
compilers, bytecode compilers take bytecode
as input. One of the interesting
characteristics of Java is that the bytecode
contains nearly the same amount of
information as the source itself. It has
even been shown by Ford [13] and by
Vliet [14] that it is possible to decompile
the bytecode of a program and produce
a Java source program similar to
the original one. This is mostly due to
the fact that the signature of the classes
in the program must be kept in the bytecode
to allow classes to be dynamically
loaded at runtime. The only significant
loss of information in the bytecode concerns
structured loops, which are transformed
into goto statements. Hence, a
bytecode compiler can easily be as efficient
as a native compiler. There are
two types of bytecode compilers: those
that generate native code and those that
generate an intermediate language, such
as C. The advantages of these two approaches
are discussed below.

• Just In Time compilers - A just-in-time
compiler differs from a “classical”
off-line compiler in that the code
is compiled only when needed at execution
time. The difference in performance
between these approaches comes from the
time that can be spent during execution
to perform optimizations. Vendors
such as Borland [4], Symantec [5], Softway [6],
and others have already released
JIT compilers. The basic scheme is to
compile a method when it is called for
the first time, pausing execution while
doing so. A refinement of this approach
has recently been described by Plezbert
and Cytron [7]. They mix interpretation
and JIT compilation by taking advantage
of multi-threading (on a multiprocessor).

• Java Processors - A Java processor is
a dedicated processor that implements
the Java Virtual Machine and directly
executes the Java bytecode. Such a processor
can be used as the main processor
in a dedicated Java machine (workstations,
embedded systems) or as a coprocessor
in a workstation. Sun and
other manufacturers are already designing
such chips. However, their competitiveness
has not yet been proved [15].
Since such processors are not currently
available, in this paper we only consider
approaches that do not require specific
hardware.


When is an Off-line Bytecode Compiler 
the Right Choice? 


Although Java was originally designed for 
programming embedded applications, it has 
recently spread to many domains. Therefore, 
to choose the appropriate execution scheme 
many factors, such as the frequency of reuse 
of the same code or the heterogeneity level of 
the set of target machines, have to be con- 
sidered. The most frequent situations are the 
following: 


• Small software components integrated
in Web services - These components
can undergo frequent changes
from one load to another by the same
client. As a result, in this context, a
JIT compiler is the most appropriate solution.

• Platform-independent large software -
Such programs may or may not
be related to Web services. Java technology
is used because of its machine independence.
The Java tools themselves are
examples of such programs (e.g., compiler,
disassembler, ...). These programs
change infrequently and are often
used by many users. Therefore, keeping
a local, optimized version of the compiled
code is advantageous. Compared with
a JIT, which always gets the latest version
of the software, this approach requires
the management of local optimized versions.
This can be implemented by a revision
control system that compiles and
installs new software versions as they are
released, in an automatic and transparent
way.

• Platform-dedicated software - Examples
are operating system components [3]
and embedded applications.
For these applications, the Java technology
provides safety. These applications
are characterized by very infrequent
changes. Hence, it is advantageous
to optimize the final code for the target
system.


Finally, it should be noted that even some
statically configured tools, such as Javadoc,
dynamically choose and load classes at execution
time. For these applications, it is thus
worthwhile to combine the binary code with
an interpreter or a JIT compiler to allow dynamic
(over)loading of new features.


Choosing C as a Target Output 


As was already stated, there are two types of 
off-line bytecode compilers: native and non- 
native. Native compilers produce code that is 
directly executable, while non-native compil- 
ers produce code in an intermediate language. 

Designing a native compiler has two advan- 
tages: (i) the generated binary code may be 
more efficient than that resulting from code 
written in an intermediate language and (ii) 
compilation is fast since it does not require 
successive tools. However, this choice has 
drawbacks: (i) it is not portable and (ii) 
generation of efficient code requires extensive 
knowledge of the features of the target pro- 
cessor. 

Non-native compilers are more flexible and 
also achieve competitive performance. In par- 
ticular, choosing C as an intermediate lan- 
guage permits the reuse of extensive compiler 
technology that has already been developed. 
In fact, 


• There are very good C compilers.

• C compilers are available for all machines.
The developer does not have to
address subtle differences that exist between
a processor and its successors.

• The development process is safer,
quicker, and in some ways simpler
since optimizations can be done on the
generated C code.

• It is possible to reuse existing, aggressive
optimizers such as Suif [16] or partial
evaluators for C such as C-mix [17]
or Tempo [18, 19].


These reasons led us to develop a non-native 
off-line compiler for Java bytecode that gen- 
erates C programs. 


3 Overview of Harissa 


Harissa is a Java environment that includes 
a compiler from Java bytecode to C and a 
Java Virtual Machine integrated in a runtime 
library. While Harissa is aimed at applica- 
tions that are statically configured, such as 
the Javac compiler, it is also designed to allow 
code to be dynamically loaded in an already 
compiled application. This novel feature is 
introduced by integrating a bytecode inter- 
preter into the runtime library. Data struc- 
tures between the Java compiled code and the 
interpreter are compatible and data allocated 
by the interpreter do not conflict with data 
allocated by the compiled code. Harissa is 
written in C and is designed with the pri- 
mary goal of providing efficient and flexible 
execution of Java applications. 

Because Harissa is written in C and 
its compiler generates C code, it is eas- 
ily portable. In fact, current ports include 
SunOS, Solaris, Linux, and DEC Alpha. This
allows us to compare the effects of optimiza- 
tions on different architectures. 

Because Harissa’s compiler produces C programs,
various compilers and optimizers can
be used. As a result, in contrast to JIT compilers,
the generated C code does not have to be
heavily optimized, since final optimizations
are made by the C compiler. Harissa only
concentrates on inefficiencies due to the architecture
of the Java Virtual Machine: stack
and method calls. To do so, several transformations
are introduced. First, the stack
is statically evaluated away. This analysis
is described in Section 3.3. Second, virtual
method calls are transformed, when possible,
into static (i.e., procedure) calls. For
these virtual calls, type checks are also eliminated.
This is described in Section 3.4. Finally,
Harissa implements several other optimizations
for object-oriented languages such
as method inlining, which are not presented.

The following sections describe the system 
in more detail. 


3.1 Compiling a Java Program 


Harissa’s compiler takes as input a class C
containing a main method and generates as
output a makefile, a -main.c file, and a
C source file for each class used in the
program³ (see Figure 1). To determine the set
of classes that depend on the initial class, an
analysis is recursively performed on the bytecode
to search for all the classes referenced
by the main class. Because of the simplicity
of this phase, it is omitted in the paper.
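
Although the paper omits this phase, the recursive search can be pictured as a reachability computation. The following is a minimal sketch, not Harissa's actual code: it assumes the references between classes have already been extracted into a hypothetical boolean matrix `refs`, and the names `collect_classes`, `needed`, and `MAX_CLASSES` are invented for illustration.

```c
#include <stdbool.h>

#define MAX_CLASSES 16   /* illustrative upper bound, not from the paper */

/* refs[i][j] is true when the bytecode of class i references class j
   (assumed to be precomputed). On return, needed[c] is true for every
   class reachable from main_class; the function returns how many
   classes were found. */
int collect_classes(bool refs[MAX_CLASSES][MAX_CLASSES], int n,
                    int main_class, bool needed[MAX_CLASSES])
{
    int work[MAX_CLASSES];   /* worklist of classes still to be scanned */
    int top = 0, count = 0;

    for (int i = 0; i < n; i++)
        needed[i] = false;
    needed[main_class] = true;
    work[top++] = main_class;

    while (top > 0) {
        int c = work[--top];
        count++;
        for (int j = 0; j < n; j++) {
            if (refs[c][j] && !needed[j]) {
                needed[j] = true;      /* each class is pushed at most once */
                work[top++] = j;
            }
        }
    }
    return count;
}
```

Each class found this way would then be translated to its own C source file.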

Compilation of a method’s bytecode into C 
is organized as follows: 


• Step 1 - The bytecode of the method is
transformed into an intermediate bytecode
representation (IBR). The purpose
of this phase is to obtain a simpler and
more regular representation. The IBR
simplifies the implementation of the subsequent
passes by making explicit more
detailed information than in the original
Java bytecode.

• Step 2 - An analysis determines the value
of the stack pointer before each instruction
and the signature type for instructions
that handle the stack. The result of
this analysis allows the stack to be statically
evaluated.

• Step 3 - A class hierarchy analysis is performed
as described in [11]. This analysis
permits the implementation of further
optimizations on the intermediate
representation. These optimizations include
method inlining, transformation
of virtual method calls into static (non-virtual)
ones, and elimination of type
checking. Method inlining and conversion
of virtual method calls into non-virtual
ones are iterated until no opportunities
for further optimizations remain.

• Step 4 - This phase aims at eliminating
bound checking. Checks are eliminated
when it is possible to merge references to
the same index as well as when the array
bound and the index can be statically
determined.⁴

• Step 5 - The final step generates the C
code from the IBR. This step is divided
into three phases: (i) generation of goto
labels and exception handling, (ii) declaration
of local variables for the method,
and (iii) translation of each intermediate
bytecode instruction into C.

Figure 1: Set of generated C files given an initial class name C
(diagram inputs: the class name C and the bytecode for all classes
related to C; the outputs include class_1.c)

³The system also supports separate compilation
to reduce compilation time and the size of generated
code. To do so, the compiler checks if a target class
exists in the library before translating it. Since separate
compilation conflicts with class hierarchy analysis
and method call optimization, it has not been
used in our benchmarks.


The following sections present our intermediate
bytecode representation and the main
algorithms that have an impact on performance,
namely the calculation of the
stack pointer and the types of the stack instructions,
and the transformation of virtual
method calls into static procedure calls.


3.2 Intermediate Bytecode Rep- 
resentation 


Our intermediate bytecode representation
has a simpler and more regular syntax, and
contains more detailed information than the
original Java bytecode. The main difference
is that, in the IBR, the types of the arguments
of the instructions that handle the stack are
made explicit. This information simplifies
subsequent passes.

⁴This phase is not yet implemented in the current
version of Harissa, although the code is designed to
include this optimization.

The data structures defining the IBR are
shown in Figure 2. A method contains information
about its body and the exceptions
that it can raise. Associated with each exception
is the program counter of its handler.
The CodeInfo structure has all the information
about each instruction. Fields in_sig
and out_sig represent the instruction’s input
and output signature, respectively. This explicit
representation of signature types eases
the subsequent analyses. The analysis described
in the next section infers the type of
instructions whose type is not explicit in the
Java bytecode.


3.3 Calculation of the Stack 
Pointer and Instruction Sig- 
nature Types 


This analysis statically evaluates the stack 
by calculating the value of the stack pointer 
and the types of all the bytecode instruc- 
tions. Most Java bytecode instructions have 
their type already associated with them, ex- 
cept those that control the stack. Because 
of the constraints enforced by the Java byte- 
code verifier [20], at each program point a 
stack instruction can have only one type sig- 
nature. For example, when the instruction 
DUP is used to duplicate an integer, it can 
not be used at the same program point to 
duplicate a double. Thus, we can straightfor- 
wardly infer the types of the stack operations. 

The analysis of a method via
CalculateSPandTypes abstractly interprets
each instruction with respect to a
Stack structure (see Figure 2), which contains
the current stack pointer value and
the type of its items. AnalyseCode and
AnalyseExc interpret the method’s body
and the code fragments corresponding to the
method’s exception handlers, respectively.
The stack is initially empty. Abstract
interpretation of an instruction can modify
the contents of the stack. If an instruction I
branches to more than one program point,
then each branch is interpreted with respect
to the stack resulting from abstractly interpreting
I. Note that for the specific case of
the jump to subroutine instruction (JSR),
used to implement exceptions, the stack is
assumed to be empty before and after the
execution of the instruction. The JSR and
RET instructions are considered to have the
same control flow as a test instruction and
the RETURN instruction, respectively. This
approximation is not correct in terms of
control flow information but gives correct
results for stack type information.

Interpretation of an instruction is as follows:
if the type of the instruction is not explicit
in the Java bytecode, then the analysis
has to infer it. The input signature is
inferred from the types on the stack (function
infer_in_sig). The output signature
is inferred by abstractly interpreting the instruction
with respect to this input signature
(function infer_out_sig). Once the signature
is known, then the instruction is abstractly
interpreted with respect to the stack
and its signature, with functions pop_sig and
push_sig. The former checks for type consistency
between the input signature and the
type of the items it pops off the stack and the
latter pushes the instruction’s output signature
onto the stack.
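
To make the analysis concrete, here is a minimal, self-contained sketch in C of the same idea on a toy instruction set. It is an illustration under stated assumptions, not Harissa's implementation: the opcode encoding, the `Instr` and `AbsStack` types, and the names `analyse`, `abs_pop`, and `abs_push` are all hypothetical, instructions are addressed by index rather than by byte offset, and JSR/RET and exception handlers are left out. It records the stack depth before each instruction and infers the operand type of DUP from the abstract stack.

```c
/* Toy opcodes named after JVM mnemonics; the encoding is invented. */
enum Op { OP_ICONST, OP_FCONST, OP_ILOAD, OP_ISTORE, OP_IMUL, OP_IINC,
          OP_DUP, OP_GOTO, OP_IF_ICMPLT, OP_IRETURN };

typedef struct { enum Op op; int target; } Instr;   /* target: branch index */

#define MAX_STACK 8
#define UNVISITED (-1)

/* Abstract stack: depth plus one type tag per slot ('I' int, 'F' float). */
typedef struct { int sp; char type[MAX_STACK]; } AbsStack;

static int abs_pop(AbsStack *s, char want) {
    if (s->sp == 0 || s->type[s->sp - 1] != want) return -1;
    s->sp--;
    return 0;
}

static int abs_push(AbsStack *s, char t) {
    if (s->sp == MAX_STACK) return -1;
    s->type[s->sp++] = t;
    return 0;
}

/* Fills sp_before[pc] with the stack depth before each reachable
   instruction and dup_type[pc] with the type inferred for each DUP.
   Returns 0 on success, -1 on an inconsistency. The stack is passed
   by value so that each branch starts from its own copy. */
int analyse(const Instr *code, int n, int pc, AbsStack stk,
            int *sp_before, char *dup_type)
{
    if (pc < 0 || pc >= n) return -1;
    if (sp_before[pc] != UNVISITED)          /* already analysed */
        return sp_before[pc] == stk.sp ? 0 : -1;
    sp_before[pc] = stk.sp;

    const Instr *in = &code[pc];
    int err = 0;
    switch (in->op) {
    case OP_ICONST: case OP_ILOAD:
        err = abs_push(&stk, 'I'); break;
    case OP_FCONST:
        err = abs_push(&stk, 'F'); break;
    case OP_ISTORE: case OP_IRETURN:
        err = abs_pop(&stk, 'I'); break;
    case OP_IMUL:
        err = abs_pop(&stk, 'I') || abs_pop(&stk, 'I') || abs_push(&stk, 'I');
        break;
    case OP_IF_ICMPLT:
        err = abs_pop(&stk, 'I') || abs_pop(&stk, 'I'); break;
    case OP_DUP:                             /* type is not in the opcode */
        if (stk.sp == 0) { err = -1; break; }
        dup_type[pc] = stk.type[stk.sp - 1]; /* infer it from the stack */
        err = abs_push(&stk, dup_type[pc]);
        break;
    case OP_IINC: case OP_GOTO:
        break;                               /* no effect on the stack */
    }
    if (err) return -1;

    if (in->op == OP_IRETURN) return 0;
    if (in->op == OP_GOTO)
        return analyse(code, n, in->target, stk, sp_before, dup_type);
    if (in->op == OP_IF_ICMPLT &&
        analyse(code, n, in->target, stk, sp_before, dup_type) != 0)
        return -1;
    return analyse(code, n, pc + 1, stk, sp_before, dup_type);
}
```

Because the depth before every instruction is a compile-time constant, each stack slot can be given a fixed C variable, which is what allows the stack to be evaluated away in the generated code.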


3.4 Transforming Virtual Calls 
into Non-Virtual Calls 


Object-oriented programming encourages
both code factoring and differential programming.
This results in smaller procedures and
more procedure calls. Procedure calls in an
object-oriented language are dynamically
dispatched. There are many analyses targeted
at optimizing dynamically dispatched
message sends. The most common are:
intra-procedural static class analysis [11],
class hierarchy analysis (CHA) [11], and
profile-guided class receiver prediction [21].
In Harissa, we have opted to integrate a class
hierarchy analysis to address this problem.

    struct MethodInfo {
        CodeInfo *code;
        ExceptionInfo *einfo;
    }

    struct CodeInfo {
        char *in_sig, *out_sig;
        char opcode;
        list *instr_branch;
    }

    struct Stack {
        int sp;
        char *stack_type;
    }

    struct ExceptionInfo {
        ExceptionInfo *next;
        int handler_pc;
    }

    CalculateSPandTypes (MethodInfo *minfo) {
        AnalyseCode (minfo->code, 0, empty_stk);
        AnalyseExc (minfo, minfo->code);
    }

    AnalyseCode (CodeInfo *code, int pc, Stack stk) {
        CodeInfo instr;
        instr = get_instr (code, pc);
        if (visited? instr)
            return;
        else
            stk = AnalyseInstr (instr, stk);
        for each instr_branch do {
            stk' = stk;
            if (instr->opcode == JSR)
                stk' = empty_stk;
            AnalyseCode (code, branch, stk');
        }
    }

    AnalyseInstr (CodeInfo instr, Stack stk) {
        char *in_sig, *out_sig;
        in_sig = instr->in_sig;
        out_sig = instr->out_sig;
        if (unknown_sig? instr) {
            in_sig = infer_in_sig (stk);
            out_sig = infer_out_sig (in_sig);
        }
        stk = pop_sig (in_sig, stk);
        stk = push_sig (out_sig, stk);
        return stk;
    }

    AnalyseExc (MethodInfo *minfo, CodeInfo *code) {
        ExceptionInfo *einfo;
        Stack stk;
        einfo = minfo->einfo;
        while (einfo != NULL) {
            /* a handler starts with the exception reference on the stack */
            stk = push_item (REF, empty_stk);
            AnalyseCode (code, einfo->handler_pc, stk);
            einfo = einfo->next;
        }
    }

Figure 2: Inferring instruction’s type
A class hierarchy analysis is a static anal- 
ysis that determines a program’s complete 
class inheritance graph (CIG) and the set of 
methods defined in each class. With the CIG, 
a specific set of possible classes, given that 
the receiver is a subclass of the class C, can 
be statically inferred and messages sent to 
the method’s receiver can be optimized. Fur- 
ther, if there are no overriding methods in 
subclasses, a message sent to the method’s 
receiver can be replaced with a direct pro- 
cedure call and possibly inlined. Inlining of 
a method can trigger other opportunities for 
converting dynamic method calls into static 
ones. Hence, these two transformations are 
iterated. 
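
The test at the heart of this optimization can be sketched in a few lines of C. This is a simplified illustration rather than Harissa's code: it assumes single inheritance, a hypothetical class table `ClassInfo` in which each class records its parent and which method slots it defines, and invented function names `can_devirtualize` and `static_target`.

```c
#include <stdbool.h>

#define MAX_METHODS 8   /* illustrative bound on method slots */

/* Hypothetical class table entry: index of the parent class (-1 for the
   root) and, per method slot, whether this class defines or overrides it. */
typedef struct {
    int  parent;
    bool defines[MAX_METHODS];
} ClassInfo;

/* True if class c is `ancestor` itself or inherits from it. */
static bool is_subclass(const ClassInfo *cls, int c, int ancestor) {
    while (c != -1) {
        if (c == ancestor) return true;
        c = cls[c].parent;
    }
    return false;
}

/* A call to method m on a receiver whose static type is recv can be bound
   statically when no proper subclass of recv overrides m. */
bool can_devirtualize(const ClassInfo *cls, int ncls, int recv, int m) {
    for (int c = 0; c < ncls; c++)
        if (c != recv && is_subclass(cls, c, recv) && cls[c].defines[m])
            return false;
    return true;
}

/* Class supplying the implementation for the direct call: the nearest
   ancestor of recv (recv included) that defines m, or -1 if none does. */
int static_target(const ClassInfo *cls, int recv, int m) {
    for (int c = recv; c != -1; c = cls[c].parent)
        if (cls[c].defines[m]) return c;
    return -1;
}
```

This check is only sound when the whole class inheritance graph is known, which is consistent with the remark in footnote 3 that separate compilation conflicts with class hierarchy analysis.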


3.5 Generation of C Code 


The generation of the C code for a method is 
done in three phases. First, the goto labels 
and exception handlers are generated. Then, 
the local variables of a method are declared 
and, finally, each IBR instruction is trans- 
lated to C. 

Generation of goto labels and declaration
of local variables are simple and are not discussed
here. The treatment of exceptions
needs some explanation. To ensure portability,
Harissa handles exceptions in a stack-based
manner. In the Java bytecode, each
exception has a region associated with it. As
described in the bytecode verifier documentation [20],
different exception regions are either
disjoint or nested, but cannot overlap.
When translating the intermediate bytecode
to C, entering an exception region pushes
the corresponding exception handler onto the
stack, and exiting an exception region pops
the exception handler off the stack. If a jump
or goto instruction leaves an exception region
or a set of nested exception regions, the corresponding
exception handlers are popped off
the stack prior to the jump or goto instruction.
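
One portable way to realize such a handler stack in C is with setjmp/longjmp. The sketch below is only an assumption about how this could look; the paper does not say which mechanism Harissa uses, and the names `handler_stack`, `throw_exception`, `safe_div`, and `EXC_DIV_ZERO` are hypothetical. Overflow checking of the handler stack is omitted for brevity.

```c
#include <setjmp.h>
#include <stdlib.h>

#define MAX_HANDLERS 32
enum { EXC_DIV_ZERO = 1 };       /* hypothetical exception code */

jmp_buf handler_stack[MAX_HANDLERS];   /* active exception handlers */
int     handler_top = 0;
int     pending_exception = 0;

/* Raise an exception: pop the innermost handler and transfer control to it. */
void throw_exception(int exc) {
    if (handler_top == 0)
        abort();                 /* uncaught exception */
    pending_exception = exc;
    longjmp(handler_stack[--handler_top], 1);
}

/* A method body with one exception region covering the division.
   Returns a / b, or -1 when the handler is run. */
int safe_div(int a, int b) {
    jmp_buf *slot = &handler_stack[handler_top++];  /* enter region: push */
    if (setjmp(*slot) != 0)
        return -1;               /* handler code; throw already popped it */
    if (b == 0)
        throw_exception(EXC_DIV_ZERO);
    int q = a / b;
    handler_top--;               /* leave region normally: pop */
    return q;
}
```

A goto that leaves one or more nested regions would decrement `handler_top` once per region before jumping, mirroring the rule stated above.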

The actual generation of the C code from
the intermediate bytecode representation is
straightforward. Figure 3-a shows some Java
source code for a method computing a power
function; Figure 3-b shows the corresponding
Java bytecode, and Figure 3-c shows the
translated C code. In the C code, the stack has
been statically evaluated: variable names prefixed
with “s” are variables that handle the
stack, while variable names prefixed with “v”
are user-defined variables. An assignment to
an s-variable corresponds to pushing a value
on the stack. A use of an s-variable corresponds
to popping a value off the stack. The
s-variables can be eliminated either by a C
compiler or by a C optimizer such as Suif [16].
Figure 3-d shows the optimized code generated
by Suif.


3.6 Method Call Implementa- 
tions 


The implementation of a class includes a vector
of function pointers that store the addresses
of the procedures implementing its methods.
Initialisation of this vector is performed when
instantiating the class, either at compile time,
by the compiler, or at run time, when dynamically
loading bytecode. After initialisation,
a pointer may refer either to a C procedure
(i.e., method) of the compiled class, to a C
procedure of an inherited compiled class, to
a C native function of the run-time library,
or to a stubc procedure. A stubc procedure
interfaces compiled code with the interpreter:
it allocates a stack for the interpreter, pushes
arguments, calls the interpreter’s entry point,
and pops the result. Stubc procedures are
generated by the compiler for each method
that might be dynamically overloaded.
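
The following C sketch illustrates this layout under simplifying assumptions. The types `Class`, `Object`, and `InterpStack`, the two-slot method vector, and the stand-in `interp_entry` are all hypothetical; a real interpreter entry point would execute bytecode rather than add two integers.

```c
typedef struct Object Object;
typedef int (*Method)(Object *self, int arg);

/* Per-class vector of function pointers, one slot per method. */
typedef struct {
    const char *name;
    Method      vtable[2];
} Class;

struct Object { const Class *cls; int field; };

/* Stand-in for the interpreter: an operand stack and an entry point. */
typedef struct { int slots[8]; int sp; } InterpStack;

void interp_entry(InterpStack *st) {
    /* Pretend the interpreted method computes field + arg. */
    int arg   = st->slots[--st->sp];
    int field = st->slots[--st->sp];
    st->slots[st->sp++] = field + arg;
}

/* Compiled method: an ordinary C procedure. */
int compiled_scale(Object *self, int arg) { return self->field * arg; }

/* Stub procedure: allocate an interpreter stack, push the arguments,
   call the interpreter's entry point, and pop the result. */
int stub_add(Object *self, int arg) {
    InterpStack st = { {0}, 0 };
    st.slots[st.sp++] = self->field;
    st.slots[st.sp++] = arg;
    interp_entry(&st);
    return st.slots[--st.sp];
}

const Class demo_class = { "Demo", { compiled_scale, stub_add } };

/* A virtual call is an indirect call through the class's vector, so the
   caller does not need to know whether the target is compiled or not. */
int invoke(Object *obj, int slot, int arg) {
    return obj->cls->vtable[slot](obj, arg);
}
```

Because both kinds of procedure share one signature, replacing a compiled entry by a stub when bytecode is loaded dynamically only requires overwriting one pointer.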
    static int P(int a, int b)
    {
        int i, r;
        r = 1;
        for (i = 0; i < b; i++) r = r * a;
        return r;
    }

a: Java source code

    Method int P(int,int)
     0 iconst_1
     1 istore_3
     2 iconst_0
     3 istore_2
     4 goto 14
     7 iload_3
     8 iload_0
     9 imul
    10 istore_3
    11 iinc 2 1
    14 iload_2
    15 iload_1
    16 if_icmplt 7
    19 iload_3
    20 ireturn

b: Java ByteCode

    TINT P(TINT vi0, TINT vi1)
    {
        TINT si1, si0;
        TINT vi2, vi3;
        si0 = 1;
        vi3 = si0;
        si0 = 0;
        vi2 = si0;
        goto L14;
    L7:
        si0 = vi3;
        si1 = vi0;
        si0 *= si1;
        vi3 = si0;
        vi2 += 1;
    L14:
        si0 = vi2;
        si1 = vi1;
        if (si0 < si1) goto L7;
        si0 = vi3;
        return si0;
    }

c: C generated code

    extern int P(int vi0, int vi1)
    {
        int vi2;
        int vi3;

        vi3 = 1;
        vi2 = 0;
        goto L14;
    L7:
        vi3 = vi3 * vi0;
        vi2 = vi2 + 1;
    L14:
        if (vi2 < vi1)
            goto L7;
        return vi3;
    }

d: Suif optimized code

Figure 3: Compilation of the power method

Interface calls are implemented using
a two-dimensional sparse vector of function
pointers for each class. The first dimension
is equal to the total number of interfaces referenced
by the program, each interface being
assigned an index at compile time. When a
class is instantiated, if the class implements
a given interface, the corresponding second
dimension of the vector is allocated and is
initialized with C procedures.
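
A minimal C sketch of such a sparse table follows. It is an assumption-laden illustration: the constant `NUM_INTERFACES`, the type `ClassItable`, and the functions `implement` and `lookup` are invented names, and all interface methods are given the same toy signature.

```c
#include <stdlib.h>

#define NUM_INTERFACES 3   /* interfaces referenced by the program (assumed) */

typedef int (*IMethod)(int);

/* First dimension: one entry per interface index. A NULL row means the
   class does not implement that interface, so nothing is allocated. */
typedef struct {
    IMethod *itable[NUM_INTERFACES];
} ClassItable;

/* Allocate the row for interface `iface` and fill it with C procedures.
   Returns 0 on success, -1 if allocation fails. */
int implement(ClassItable *c, int iface, const IMethod *methods, int n) {
    IMethod *row = malloc((size_t)n * sizeof *row);
    if (row == NULL) return -1;
    for (int i = 0; i < n; i++)
        row[i] = methods[i];
    c->itable[iface] = row;
    return 0;
}

/* Interface call lookup: index by interface, then by method.
   Returns NULL when the class does not implement the interface. */
IMethod lookup(const ClassItable *c, int iface, int method) {
    return c->itable[iface] ? c->itable[iface][method] : NULL;
}

/* Two sample procedures used to populate a row. */
int twice(int x)  { return 2 * x; }
int negate(int x) { return -x; }
```

The cost of a call is two indexed loads and an indirect jump, at the price of one pointer per interface in every class.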


3.7 Current Status and Limita- 
tions of Harissa 


Harissa is provided in two versions, with and 
without garbage collection (GC). This allows 
us to estimate the influence of GC on its per- 
formance. The GC version is based on the 
Boehm-Demers-Weiser conservative garbage 
collector [22]. The non-GC version relies on 
malloc, which leads to an increase in swap- 
ping and I/O since objects are never deallo- 
cated. 


At the current time, threads are not implemented.
Nevertheless, the system is already
designed to include them and the generated
C code contains the necessary calls to synchronization
functions. The implementation of
synchronization optimizes the single-thread
case. As long as no additional threads are
created, synchronization calls point to a null
procedure. The creation of additional threads
is detected by guards [23], which then plug in
the multi-thread synchronization function.
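
The idea can be sketched in C with a function pointer that is swapped on the first thread creation. This is a hypothetical illustration: the names `monitor_enter`, `sync_null`, `sync_real`, and `on_thread_create` are invented, the guard mechanism of [23] is reduced to a plain function call, and the counter `lock_count` only stands in for real locking.

```c
typedef void (*SyncFn)(void *obj);

int lock_count = 0;   /* instrumentation for this sketch only */

/* Single-thread case: synchronization is a no-op. */
static void sync_null(void *obj) { (void)obj; }

/* Multi-thread case: a real implementation would acquire the object's
   monitor here; this sketch just counts the calls. */
static void sync_real(void *obj) { (void)obj; lock_count++; }

/* Generated C code always calls through this pointer. */
SyncFn monitor_enter = sync_null;

/* Guard action: run when the program creates its first additional thread. */
void on_thread_create(void) { monitor_enter = sync_real; }
```

Single-threaded programs then pay only for an indirect call to an empty procedure at each synchronization point.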


For efficiency, Harissa produces target C code
that relies on some gcc extensions. This is
not a major limitation since gcc is available 
on many platforms. We plan to eliminate this 
dependency, in order to be able to test vendor 
C compilers. Finally, there are some native 
libraries, such as the graphic library, that are 
not yet supported. 


4 Related Work 


Other Off-line Compilers 


To our knowledge, there are two other compilers
from bytecode to C: J2C and Toba⁵.
Harissa is the only environment that inte- 
grates an interpreter. J2C performs no opti- 
mizations when generating C code (i.e., stack 
evaluation or method call optimization). It 
is still immature and fails for many applica- 
tions. Toba does a stack analysis similar to 
the one included in Harissa and generates C 
code from which transient variables have been 
eliminated. However, Toba does not do any 
method call optimizations. Currently, Toba 
is slightly more mature than Harissa since it 
supports threads. 


Previous Work in CHA 


Compilers for other object-oriented languages 
have included a CHA to optimize dynamically 
dispatched calls. In [24], Vortex, an optimizing
compiler for object-oriented languages, is
presented. Vortex differs from Harissa in the
following ways. It is a language-independent 
compiler with front-ends for Java, Cecil, 
C++, and Modula-3. Vortex takes as input 
source code. This approach limits its domain 
of use in the case of Java since source code 
is often not available. The optimizations it 
performs range from standard ones, such as 
constant propagation, dead code elimination, 
and method inlining, to optimizations specific 
to object-oriented languages, such as intra- 
procedural static class analysis, class hierar- 
chy analysis [11], and profile-guided class re- 
ceiver prediction [21]. The Vortex compiler 
has been used to study the impact of each 
of these optimizations alone and in combination.
In [11], it is shown that class hierarchy
analysis and profile-guided class receiver prediction
are complementary transformations:
the combination of the two produces a compounding
effect.


⁵ Harissa and Toba have been developed independently
at the same time.

Fernandez presents an optimizing linker 
that does class hierarchy analysis of Modula-3 
programs [25]. Optimizations and code gen- 
eration are done at link-time. The problem 
with this approach is that further optimiza- 
tions that can result from transforming vir- 
tual calls into static procedure calls cannot be 
done by the compiler. An optimizing source- 
to-source C++ compiler is presented in [26]. 
The number of virtual method calls is reduced
by performing both type feedback [27]
and class hierarchy analysis. Method inlining 
is done as well. The optimized program is 
compiled by a native host C++ compiler. 


5 Benchmarks 


This section analyzes the performance gain
that can be expected from an aggressive bytecode
compiler. We compare the execution of
Harissa-compiled programs with several industrial
JIT compilers, the J2C and Toba
bytecode compilers, and the JDK 1.0.2 interpreter.

Performance of JIT compilers is by nature 
sensitive to the target architecture since they 
compile into native code. To get more repre- 
sentative results, we have run the benchmarks 
on two different platforms: a Dell 100 MHz
Pentium PC and a Sun 85 MHz SPARCstation
5 (SS5). On the Pentium, Harissa is com- 
pared with the JIT compilers embedded in 
Netscape 3.0 and Microsoft Internet Explorer 
3.0. On the Sparc, Harissa is compared with 
the Guava JIT compiler from Softway [6]. 

Three different kinds of benchmarks are 
presented: micro-benchmarks, which are used 
to evaluate the efficiency of JIT and off- 
line compilers for pure computations (with- 
out I/O); large benchmarks, which are used 
to compare JIT and off-line compilers for 
real applications that include I/O; and finally, 
benchmarks to evaluate the effectiveness of
the CHA for Java applications.


Conference on Object-Oriented Technologies and Systems - June 16-20, 1997


Summary of results 


Figure 4 summarizes our results. The micro-benchmark
tests are made using Caffeine
2.5 [10]. Each Caffeine micro-benchmark
tests one feature of the Java machine. On
these tests, Harissa-generated code is on average
50 times faster than JDK, 5 times faster
than the Softway Guava JIT [6] and 50% faster
than the Microsoft JIT.


On real application benchmarks, results depend
mainly on how much pure computation
the program does. On applications dominated
by I/O, such as JHLZip and JHLUnzip,
there is not much difference between off-line
and JIT compilers; JDK is only 1.5 times slower
than Harissa. On applications such as Javac
and Javadoc, which rely on a mix of computation
and I/O, Harissa is 5 times faster
than JDK, 3 times faster than the Softway Guava
JIT and 30% faster than the Toba [9] bytecode
compiler. On pure computation programs,
such as an Othello game [28], Harissa
is 2.6 times faster than Guava, 1.7 times faster than
Toba and 44 times faster than JDK. Toba results
are missing when it was not possible to
run it successfully, for reasons described below.


Methodology 


Harissa has been configured so that during 
compilation, only methods with a size smaller 
than 100 instructions are inlined. The C code 
generated by Harissa and J2C has been com- 
piled using gcc with the “-O2” option. The 
gcc version used is 2.7.2 on the Sun and 2.7.0 
on the PC/Linux. Toba-generated C code 
has been compiled using Sun’s commercial C 
compiler with the “-xO4” option.


Figure 4: Execution time normalized to Harissa


5.1 Caffeine Micro-benchmarks 


The Caffeine micro-benchmarks produce 
numbers, in CaffeineMarks (higher is faster), 
that allow one to compare heterogeneous 
architectures and Java implementations di- 
rectly. Among them, we consider those that 
are related to the compilation scheme and 
that do not rely on graphic computations or 
the garbage collector: 


• Sieve calculates prime numbers under
2048;


• String2 tests string concatenation and
search;


• Logic executes loops containing decision
trees;


• Loops runs several types of integer loops;


• Floating Point (i.e., FP) simulates the
calculations needed to rotate 50 three-dimensional
points by 90 degrees, 5 degrees
at a time;


• Method tests how fast the VM performs
method calls.


General comments about the results 


The results of our evaluation are presented in 
Table 1 for the SS5 and in Table 2 for the PC. 
The two rightmost columns present Harissa’s 
results with some further optimizations that 
are described below. In general, the PC is 
faster than the Sun. On the SS5, JDK and
the interpreter embedded in Netscape achieve
similar results, while the code generated by
Harissa is 5 to 140 times faster than the
JDK interpreter. On the PC, Microsoft’s JIT
compiler seems to be slightly faster than
Netscape’s, except for the tests String2
and FP, which are twice as fast under Microsoft.


JIT compilers vs Harissa 


The relevance of the micro-benchmarks when
comparing JIT compilers and Harissa is to
measure the efficiency of the compilation
scheme. Since the tests loop on the same
code, JIT compilers do not lose time during
execution waiting for the compilation of
a method. Furthermore, with the exception
of the Method test, no method calls are made.
Harissa’s inter-procedural optimizations such
as CHA and method inlining thus have very
little influence on the results. Therefore,
these tests make it possible to evaluate precisely
the quality of the code that is produced by JIT
compilers.


Table 1: Comparison between JIT and Harissa on an 85 MHz SS5 (in CaffeineMarks, Cm)

Our measurements show that the code generated
by Harissa’s compiler is almost always
faster than that of JIT compilers. Nevertheless, the
results are architecture dependent. On the
SS5, Harissa is 1.5 (for Sieve) to 13 (for Logic)
times faster than the Guava JIT. On the
PC, results are more balanced and the difference
in performance between Harissa and
Microsoft is smaller than between Harissa and
Guava, with a maximum of 2.5 times faster.
For two tests, Sieve and FP, Harissa is actually
twice as slow.


Improving the performance of the code 
generated by Harissa 


To understand the reasons for the inefficiency 
of the code generated by Harissa for the tests 
Sieve and FP, we have analyzed the assembly 
code generated by gcc. For the Sieve test, 
it appears that the critical loop is about 20 
instructions long. That does not leave much 
room for possible optimizations.

We have identified two reasons for inefficiency,
which are in fact due to limitations
of the gcc optimizer. As expected, transient
stack variables are eliminated by gcc. But
further optimizations resulting from variable
and constant propagation are not triggered.
For instance, in the Sieve test, stack variable
elimination transforms a “divide by i”
into a “divide by 2” that could then be efficiently
transformed into a shift instruction.
To evaluate the impact of this problem, we
have used the Suif C optimizer [16] to systematically
eliminate these variables using a
combination of the “constant/variable propagation”
and “dead code elimination” passes.
The effect on the PC is dramatic for the FP
and Sieve tests, nearly doubling performance.
On the other tests there
is little or no influence, which shows that this
situation is not so frequent. On the Sparc,
the influence of stack variable elimination is
lower than on the PC. This is because the
relative cost of processor instructions differs
significantly between the Sparc and the Pentium.
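The effect of transient stack variables can be pictured with a small C++ sketch. It is only an illustration: the slot names s0 and s1 and the shape of the naive code are assumptions, not actual Harissa output, and the shift form is valid only for non-negative operands.

```cpp
#include <cassert>

// Shape of code translated naively from stack bytecode: every operand is
// first copied into a transient "stack slot" variable (s0, s1).
// The names and the exact shape are assumptions made for illustration.
int half_naive(int n) {
    int s0, s1;
    s0 = n;        // push n
    s1 = 2;        // push the constant 2
    s0 = s0 / s1;  // divide the two top-of-stack slots
    return s0;
}

// After constant/variable propagation and dead code elimination the
// divisor is a visible constant.
int half_propagated(int n) {
    return n / 2;
}

// With a visible constant divisor and a value known to be non-negative,
// the division can be replaced by a shift.
int half_shifted(int n) {
    assert(n >= 0);  // n >> 1 and n / 2 differ for negative odd n
    return n >> 1;
}
```

All three functions return the same value for non-negative inputs; the point is that the last form only becomes reachable once the constant has been propagated into the division.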

Table 2: Comparison between JIT and Harissa on a 100 MHz PC-Pentium (in CaffeineMarks, Cm)


A second source of inefficiency is the fact
that loops are compiled into bytecode goto
instructions. Therefore, gcc does not have all
the necessary information to make the best
choice regarding caching of temporary results
in registers. To determine the consequences
of this problem, we have reconstructed loops
by hand for the Sieve and FP benchmarks.
On both the Sparc and the PC, there is a
performance increase of between 5% and 10%. Finally,
it should be noted that after performing
the optimizations, Harissa’s compiled code is
about 10% faster than Microsoft’s JIT.
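The difference between a loop expressed with gotos and the same loop with its construct rebuilt can be sketched as follows. Both functions are hypothetical examples written for this illustration (they are not taken from the benchmarks), and no claim is made here about how a particular compiler treats either form.

```cpp
// Loop as it might appear when each bytecode branch is translated
// literally into a C goto (illustrative shape, not Harissa output).
int sum_goto(const int* a, int n) {
    int i = 0, sum = 0;
    goto test;
body:
    sum += a[i];
    ++i;
test:
    if (i < n) goto body;
    return sum;
}

// The same computation with the loop construct rebuilt by hand.
int sum_structured(const int* a, int n) {
    int sum = 0;
    for (int i = 0; i < n; ++i)
        sum += a[i];
    return sum;
}
```

The two versions compute the same result; the structured one makes the induction variable and the loop bounds explicit to the C compiler.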


5.2 Real-Sized Benchmarks 


These benchmarks are used to estimate the 
efficiency of Harissa in a real environment. 
To do so, we have evaluated the execution 
time of a set of programs that either do pure 
computations, substantial I/O, or a mixture 
of both. Pure computation programs are represented
by an Othello game [28]. File handling
applications (e.g., I/O) are represented
by JHLZip and JHLUnzip, which insert and
extract files from an archive without compression.
Mixed computation-I/O programs
are represented by two of Sun’s JDK tools, the
Javac compiler and the Javadoc documentation
generator, and by Kawa, a Scheme interpreter
[29].

The benchmarks were made in a single-user
environment to avoid external interference.
It was not possible to run benchmarks for the JIT
compilers embedded in the Web browsers for
security protection reasons. Performance of
tools such as Javac and Javadoc depends significantly
on their input. To get representative
results, we ran them on a set of large Java
programs that are available on the net:
• Jas generates bytecode from a Scheme-based
scripting language [30].


• Jax generates tokenizers from regular expressions [30].


• Jell generates a recursive descent parser
from an LL(1) grammar [30].


• Kawa is a Scheme interpreter [29].


Comparisons are performed on real exe- 
cution time, which includes waiting for the 
end of I/O, since this corresponds to what 
the user observes. For completeness, we have 
also detailed user and system CPU time spent 
during the execution to measure the efficiency 
of pure computations. 


Detailed Javac results 


Detailed timings of Javac execution are presented
in Table 3. In comparison with JDK,
Harissa achieves the highest speedup, which
is greater than 5. Toba is on average 3.3
times faster than JDK, J2C is about 2.5 times
faster, and Guava is 1.5 times faster. These
results clearly show the benefits of the various
optimizations performed in Harissa.

We have also compared Harissa’s GC version
with the non-GC one. The GC
version is 20% faster than the non-GC one.
This is due to the fact that never reclaiming
objects leads to an increase in swapping, I/O,
and in the amount of address space that has
to be allocated by the system to the process.


Table 3: Compilation time of several Java programs


Detailed Javadoc results 


Javadoc is representative of tools that rely on 
the dynamic capabilities provided by Java to 
load bytecode during execution. Therefore, 
it is not possible to execute it with a pure 
bytecode to C compiler such as Toba or J2C. 
Although dynamically loaded classes are in- 
terpreted, most of the execution time is spent 
in the compiled code. Thus, Harissa’s gener- 
ated code is on average 5 times faster than 
JDK and 3 times faster than Guava. 


Other Benchmarks 


The JHLZip and JHLUnzip tools [31] insert and
extract files from an archive. The tested version
of these tools does not include compression
and therefore, execution is dominated
by I/O. Our tests have been done using
the JDK 1.0.2 classes.zip file as input. As
could be expected, compilers (off-line and
JIT) achieve the same level of performance.
Finally, JDK is only 1.5 times slower than the
compilers.

The tested implementation of the Othello
game [28] allocates a finite time to the computer
player to solve one move. The depth of
the search depends on the speed of the generated
code. We give the time spent to solve
up to depth 5 on the first move.


5.3 CHA Evaluation 


The impact of the class hierarchy analysis has
been studied for many object-oriented languages,
including Java. It has been shown
that this analysis can improve program performance
by between 23% and 89% [11]. Table
6 presents the impact of CHA for the programs
we have benchmarked. It shows that
our CHA implementation allows between 14%
and 40% of the virtual call points to be transformed
into procedure calls.
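The core test behind such a transformation can be sketched in a few lines of C++. This is a minimal illustration under stated assumptions, not Harissa's implementation: the types ClassInfo and Hierarchy and both function names are hypothetical, and the whole class hierarchy is assumed to be known when the analysis runs.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Minimal model of a class hierarchy: each class names its direct
// subclasses and the methods it defines itself. Hypothetical types.
struct ClassInfo {
    std::vector<std::string> subclasses;
    std::set<std::string> defines;
};
using Hierarchy = std::map<std::string, ClassInfo>;

// Returns true if any class strictly below `cls` redefines `method`.
bool overridden_below(const Hierarchy& h, const std::string& cls,
                      const std::string& method) {
    auto it = h.find(cls);
    if (it == h.end()) return false;
    for (const std::string& sub : it->second.subclasses) {
        auto s = h.find(sub);
        if (s != h.end() && s->second.defines.count(method)) return true;
        if (overridden_below(h, sub, method)) return true;
    }
    return false;
}

// A call through a receiver of static type `cls` can be turned into a
// direct procedure call when no subclass overrides the method.
bool can_bind_statically(const Hierarchy& h, const std::string& cls,
                         const std::string& method) {
    return !overridden_below(h, cls, method);
}
```

In an environment that can load classes dynamically, an answer computed this way holds only for the classes seen so far, which is why the closed-hierarchy assumption is stated explicitly.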


6 Conclusion and Future Work


The contribution of this work is threefold:
(i) we have designed a hybrid environment
for Java, named Harissa, that permits mixing
of interpretation with compiled bytecode, (ii)
we have designed an aggressive bytecode to
C compiler whose generated code is more efficient
than that of other compilers, and (iii) we have
measured the relative efficiency of code produced
by off-line and JIT compilers.


Tradeoffs Between JIT and Off-line 
Compilers? 


The micro-benchmarks presented in sec- 
tion 5.1 clearly show that an optimized off- 
line compiler such as Harissa’s is faster than 
a JIT compiler. The gap between JIT and 
off-line compilers is greater for the SPARC 
than for the Pentium. This is due to the fact 
that binary code for modern RISC processors 
is complex to optimize and requires analyses 
that are hard to run in the short time allocated
to on-the-fly compilation.

However, the JIT and off-line strategies 
can be made complementary. As shown by 
Plezbert and Cytron [7], a compilation pro- 
cess can consist of running the unoptimized 
code while another process does aggressive 
compilation in the background. Once the
optimized code is available, the unoptimized 
code is replaced with the optimized one. 
Since our system already mixes bytecode in- 
terpretation and binary execution, this con- 
tinuous compilation scheme can be incorpo- 
rated easily in Harissa. 
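The replacement step of such a scheme can be pictured as rebinding a method slot. The following C++ sketch is an illustration under simplifying assumptions: MethodSlot, square_interpreted and square_compiled are hypothetical names, the "interpreted" version is a deliberately slow stand-in, and the background compiler is replaced by an explicit assignment to the slot.

```cpp
// A method is called through a slot holding a function pointer.
using Method = int (*)(int);

// Stand-in for interpreting bytecode: squares x by repeated addition.
int square_interpreted(int x) {
    int n = x < 0 ? -x : x;
    int result = 0;
    for (int i = 0; i < n; ++i) result += n;
    return result;
}

// Stand-in for the optimized, compiled version of the same method.
int square_compiled(int x) { return x * x; }

// Callers always go through the slot, so rebinding `target` switches
// every later call to the new implementation.
struct MethodSlot {
    Method target;
    int call(int x) const { return target(x); }
};
```

A caller would start with `MethodSlot slot{square_interpreted};` and later execute `slot.target = square_compiled;` once the optimized code is available; calls made through `slot.call` return the same results before and after the switch.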


Opportunities for Further Optimiza- 
tions 


While Harissa-generated code is already fast,
our micro-benchmarks show that there are
still opportunities for improvement. In the near
future, we plan to integrate an analysis for
eliminating transient stack variables so as to
be independent of Suif. Furthermore, we
are studying the development of a transformation
phase, based on control flow information,
aimed at rebuilding loop constructs. As
was shown earlier, structured programs are
usually better compiled.

Finally, we also plan to eliminate some type
and bounds checks, since close examination
of the generated C code has shown
that most of them could be evaluated statically by
means of a simple intra-procedural analysis.
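A bounds check that can be decided statically looks like the one in the sketch below. The C++ code is a hypothetical illustration (function names are assumptions, and std::vector stands in for a Java array): in sum_checked the loop condition already guarantees the property that checked_get tests on every access.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Element access with the per-access check that Java semantics require.
int checked_get(const std::vector<int>& a, std::size_t i) {
    if (i >= a.size()) throw std::out_of_range("index out of bounds");
    return a[i];
}

// Every access is checked, although the loop condition already
// guarantees i < a.size().
int sum_checked(const std::vector<int>& a) {
    int sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i) sum += checked_get(a, i);
    return sum;
}

// Version in which the check has been dropped because i < a.size() is
// established by the loop condition within the same procedure.
int sum_unchecked(const std::vector<int>& a) {
    int sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i) sum += a[i];
    return sum;
}
```

Only the facts visible inside the loop are needed to justify removing the check, which is what makes an intra-procedural analysis sufficient for this pattern.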
Availability


A binary version of our system is freely available
on the WWW and can be downloaded from:
http://www.irisa.fr/compose/harissa- 
/harissa.html 
Abstract: The Montana C++ programming environment
provides an API interface to the compiler, which allows 
the compilation process to be extended through 
programmer-supplied tools. This paper investigates the 
feasibility of that interface, using smart pointers as an 
example. Smart pointers are a powerful feature of the C++ 
language that enable a variety of applications, such as 
garbage collection, persistence, and distributed objects. 
However, while smart pointers can be used in much the 
same way as built-in pointers, they are not 
interchangeable. Using the Montana API, smart pointer 
functionality can be introduced for built-in pointers, thus 
enabling built-in pointers that act like smart pointers. We
provide an overview of the Montana programming
environment and describe how smart pointers can be
implemented using the Montana API.


1. Introduction 


The Montana¹ C++ programming environment is a joint
development effort between IBM’s Software Solutions and 
Research Divisions, and will be the base for a future 
release of IBM’s VisualAge C++ product. Montana 
provides many unique features over traditional C++ 
compilers, most notably support for complete incremental 
compilation and an API interface [Nac96]. 


The purpose of this paper is to assess the feasibility of the 
Montana API interface for extending the compilation 
process to augment built-in language syntax. We have 
chosen the C++ smart pointer support as a basis of 
comparison. In this paper we present a partial smart 
pointer implementation using a Montana extension, where 
built-in pointer operations are modified as part of the 
compilation process, and summarize the results. 


* This work was performed while the author was a member of the C++ 
compiler development group at the IBM Toronto Laboratory. 

¹ The name "Montana" originated from an architecture meeting in
which the idea of developing a new compiler with a clean slate was
referred to as a "blue sky" approach. Since "blue sky" was thought to
be the motto for the state of Montana (it’s actually "big sky"), that
became the name of the project. [Nac96]


2. The Montana C++ Programming 
Environment 


The Montana project grew from the recognition that 
current C++ development environments, while improving, 
were lacking in many areas, especially compared with 
those available for languages such as Smalltalk. One of 
the major frustrations in developing large C++ 
applications is the build turnaround time. The goal for 
Montana is to provide extremely fast incremental 
compilation, so that recompilation time required is 
proportional to the size of the change. In particular, 
changing a header file should not force recompilation of
all files that happen to include it. 


The design goal for the Montana architecture is that it can
be extended in a variety of ways. A good example is the
Montana object model² support. Most C++ compilers
support a single native object model, the semantics for
which are entrenched in the compiler itself, making it
difficult to support different object models, such as DirectToSOM
C++ [Ham96] or other industry object models. Montana, 
however, was designed so that the object model is 
supported through a well-defined interface. A new object 
model can be added without requiring massive changes 
throughout the compiler. At the time of writing, the author 
was responsible for the design and development of such 
non-native object models. 


Montana is designed around a system called CodeStore 
[Bar94]. CodeStore consists of a C++ parser, a database 
that contains the compiled C++ program representation, 
and a class library that provides an API interface to the 
compilation process and program representation. Using 
this class library interface, C++-knowledgeable tools such 
as browsers can query the program representation of a 
compiled C++ program. In addition, CodeStore tools 
called extensions can be written that interact with the 
compilation process. 


There are three types of extensions [Sor96]: 1) CodeStore
extensions, which add data to the CodeStore and have
incremental update capability, 2) incorporation³
extensions, which modify or observe the incorporation
process directly, and 3) user interface extensions, which
allow additional artifacts such as buttons and menus to be
added to the user interface display. An example of the first
type of extension is a separate compiler that is triggered as
part of the compilation process to handle different file
types, while an example of an incorporation extension is
a tool that interacts directly with the compilation process
itself, querying or updating the result. In this paper, we
will concentrate on the second form of extension.


² By object model, we mean issues such as how objects are laid out in
memory and the strategy used to support virtual functions and bases.
See [Lip96] for a detailed discussion.


3. Smart Pointers 


Smart pointers are a powerful feature of the C++ language 
that enable a variety of applications, such as garbage 
collection, persistence, and distributed objects. They are 
used to augment the functionality of C++ pointer 
operations, allowing the programmer to perform
additional work when pointers are created and used. 


Smart pointers essentially allow a user-defined exit to be
added to pointer operations. A smart pointer [Stro89] itself is
an instance of a class that wraps a built-in pointer, for
which the dereference operator -> has been overloaded,
as shown in Figure 1 (smart pointers are typically defined
using templates, however). Such objects can be used in
much the same way as a built-in pointer, but have
additional functionality provided through operator
overloading. In much the same way as inheritance, smart
pointers can be used in C++ to extend the functionality of
a class. However, while inheritance extends the
functionality of class instances themselves, smart pointers
are used to extend the environment containing the
instance. In other words, smart pointers are used to
modify how the programming environment operates on an
object, rather than how the object operates on itself. Smart
pointers have a wide variety of uses, from simple
applications such as detecting null dereferences,
debugging, and read-only pointers [Alg95], to more
complex applications such as garbage collection [GC96],
[Ede92a], and persistence [Coh96].


#include <iostream.h>

struct S {
    int i;
};

class SP {
    S *_p;
public:
    SP(S *p) : _p(p) {}
    S* operator->() {
        cout << "dereferencing" << endl;
        return _p;
    }
};

int main()
{
    SP sp(new S);

    sp->i = 10; // sp.operator->()->i
}

Figure 1 Simple Smart Pointer Class


³ Incorporation is the Montana term for recompiling a program, in
which the changes to the source will be incorporated into the
CodeStore database.


In general, smart pointers can be used in exactly the same
way as built-in pointers; however, as described in
[Ede92b], there are some important differences between
the two with respect to implicit type conversions
performed by the compiler. These fall into two major
categories: 1) class hierarchies and 2) types qualified with
const or volatile. A further issue, described in
[Mey96a] and [Mey96b], is testing for nullness. When
using built-in pointers, the compiler implicitly performs a
variety of conversions between pointer types. Examples
are T* to const T*, Derived* to Base*, and T* to
void*. These implicit conversions are not supported for
smart pointer types.
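The conversion gap can be checked at compile time. The sketch below is written in present-day C++ (it uses type traits that postdate this paper), and the template SP, Base and Derived are hypothetical names; SP is a bare-bones smart pointer with no hand-written converting constructors, which is the case the discussion above has in mind.

```cpp
#include <type_traits>

struct Base {};
struct Derived : Base {};

// A bare-bones smart pointer template with no converting constructors.
template <typename T>
class SP {
    T* p_;
public:
    explicit SP(T* p = nullptr) : p_(p) {}
    T* operator->() const { return p_; }
};

// Built-in pointers convert implicitly in all three cases.
static_assert(std::is_convertible<Derived*, Base*>::value, "Derived* to Base*");
static_assert(std::is_convertible<int*, const int*>::value, "T* to const T*");
static_assert(std::is_convertible<int*, void*>::value, "T* to void*");

// The corresponding smart pointer types are unrelated classes, so the
// compiler performs no implicit conversion between them.
static_assert(!std::is_convertible<SP<Derived>, SP<Base>>::value,
              "no implicit SP<Derived> to SP<Base>");
static_assert(!std::is_convertible<SP<int>, SP<const int>>::value,
              "no implicit SP<int> to SP<const int>");
```

A smart pointer class can restore some of these conversions by adding converting constructors by hand, but each one has to be written explicitly, whereas built-in pointers get them from the language.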


If, however, the "smarts" of a smart pointer could be 
added to a built-in pointer, these problems would be 
alleviated. In this section, we describe the changes that 
would be needed to built-in pointer expressions in order 
that they operate as smart pointers. For the purpose of this 
example, we will implement a reference counting smart 
pointer. In subsequent sections, we will describe how to 
implement this model using a Montana incorporation 
extension. 


3.1 A Reference Counting Smart Pointer 


The basic model for a reference-counting smart pointer is 
as follows: 


1) Whenever a new reference is made to a given object, 
the reference count for that object should be 
incremented. 

2) If a reference to an object is removed, the reference
count for that object should be decremented. If the 
reference count for an object goes to zero, delete the 
object. 


These rules are illustrated by the functions increment
and decrement shown in Figure 2.


void decrement(ReferenceCounter *sp)
{
    if (!sp)
        return;
    if (!--sp->rc)
        delete sp;
}

void increment(ReferenceCounter *sp)
{
    if (!sp)
        return;
    ++sp->rc;
}

Figure 2 Reference Counter Functions


In order to add reference counting smart pointer
functionality to built-in pointer operations, the following
expression transformations are required:


Pointer assignment: Whenever an assignment is made to 
a designated smart built-in pointer, the reference count for 
the object originally pointed to should be decremented and 
that of the object now pointed to should be incremented. 
Thus, the expression p1 = p2 becomes:


(p1 == p2 ? 0 :
    (decrement(p1),
     p1 = p2, increment(p1)),
 p1)
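The effect of the assignment transformation can be tried out with a small self-contained C++ sketch. Several parts are assumptions made for the illustration: the layout of ReferenceCounter (a public rc member and a virtual destructor), the helper function assign, which plays the role of the transformed expression, and the Tracked class, which exists only to make deletions observable.

```cpp
// Assumed shape of the counted base class.
struct ReferenceCounter {
    int rc = 0;
    virtual ~ReferenceCounter() {}
};

// Compact equivalents of the increment and decrement functions.
void increment(ReferenceCounter* sp) { if (sp) ++sp->rc; }
void decrement(ReferenceCounter* sp) { if (sp && --sp->rc == 0) delete sp; }

// Function form of the transformed expression for `p1 = p2`.
ReferenceCounter*& assign(ReferenceCounter*& p1, ReferenceCounter* p2) {
    if (p1 != p2) {
        decrement(p1);  // the old referent loses one reference
        p1 = p2;
        increment(p1);  // the new referent gains one
    }
    return p1;
}

// Helper used only to observe deletions in this sketch.
struct Tracked : ReferenceCounter {
    static int live;
    Tracked() { ++live; }
    ~Tracked() override { --live; }
};
int Tracked::live = 0;
```

With two pointers initialized to zero, assigning a fresh Tracked object to each and then assigning one pointer to the other deletes the object that is no longer referenced, and assigning zero to both pointers afterwards deletes the remaining one.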


Pointer initialization: A designated smart built-in pointer 
must always be either explicitly initialized to a value, or 
to zero. (If a pointer were not initialized to zero and 
contained non-zero garbage, a subsequent assignment to 
that pointer using the previous expression would likely 
result in an exception). 


The statement SPC* p1; becomes:

SPC* p1 = 0;

and SPC** p1 = new SPC*; becomes:

SPC** p1 = new SPC*; *p1 = 0;

If a smart pointer is initialized to a value, the reference
count for the underlying object must be incremented. So
the statement SPC* p1 = p2; becomes:

SPC* p1 = p2; increment(p1);

Object initialization: When a designated smart built-in
pointer object is created, the reference count must be
initialized. For dynamically-created objects, the count
should be initialized to 0, and for static or automatic
objects, the reference count should be initialized to 1 so
that the object can be used in reference counting contexts,
but will never be deleted.


Pointer destruction: When a designated smart built-in 
pointer is destroyed the reference count for the referenced 
object must be decremented. There are several ways that 
a smart pointer will be destroyed, the most common being 
that it goes out of scope. Other possibilities are that a 
dynamically allocated smart pointer is deleted, or an 
exception occurs in which the containing block is 
unwound from the stack. Only the deletion of a 
dynamically-allocated smart pointer consists of an 
expression that can be transformed. The other two require 
modifications to the function itself so that the scope 
termination and exception handling code will include the 
decrement of any smart pointers declared therein. 


3.2 Which Built-in Pointers Become Smart? 


The above discussion raises the question of how to 
determine which built-in pointer operations should be 
transformed into smart pointer operations. One could 
blindly apply the transformation to all built-in pointer 
operations, but this would certainly be overkill. Rather, we 
would like to select only specific pointers for the 
transformation. The approach that we have chosen is to
define a special base class, ReferenceCounter. 
Expressions involving objects declared of, or pointers to, 
a class derived from ReferenceCounter will be 
transformed as described above. 
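The selection rule amounts to a derivation test on the pointed-to class. The following sketch expresses it with a present-day C++ type trait; it is only an analogy for the check an extension would perform on the program representation, and is_smart_pointee, Plain and the body of ReferenceCounter are hypothetical.

```cpp
#include <type_traits>

struct ReferenceCounter { int rc = 0; };   // assumed layout
struct C : ReferenceCounter {};            // selected for transformation
struct Plain {};                           // left untouched

// True when pointers to T would be selected for the smart pointer
// transformation under the base-class rule.
template <typename T>
constexpr bool is_smart_pointee() {
    return std::is_base_of<ReferenceCounter, T>::value;
}
```

Under this rule `is_smart_pointee<C>()` is true while `is_smart_pointee<Plain>()` and `is_smart_pointee<int>()` are false, so only operations on C* would be rewritten.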


For example, consider the built-in pointers declared of 
type C* in Figure 3. Because class C is derived from 
ReferenceCounter, several transformations should 
take place. cp1 and cp2 should be implicitly initialized 
to 0 at the point of declaration, and the assignment from
cp2 to cpl should be transformed as described earlier. 


class C : public ReferenceCounter {};

int main()
{
    C *cp1, *cp2;

    cp1 = cp2;
}

Figure 3 Built-in Pointer Operations


4. Montana CodeStore Architecture 


The previous section described the necessary
transformations to built-in pointer operations in order to
implement a reference counting smart pointer. This 
section provides an overview of the Montana CodeStore 
architecture and describes in a generic sense how a 
transformation extension can be added. This will form the 
basis for the remainder of the paper, which describes our 
implementation of smart built-in pointers in Montana 
using an incorporation extension. 


4.1 The Montana Incorporation Process 


As part of the Montana incremental compilation process, 
the compiler separates a source file into regions, where 
each region consists of approximately one declaration. If 
a region, or something that region depends upon, has 
changed since the last incorporation, it is re-incorporated.
Re-incorporation involves a number of standard steps: 
parsing, semantic analysis, transformation, error checking, 
code generation, and incremental linking. In addition, 
dependency arcs are added between CodeStore elements 
so that a change in one region can trigger a re- 
incorporation of a dependent region. For example, a 
region containing a derived class declaration will have a 
dependency on each region containing one of its base 
classes. 
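The way dependency arcs drive re-incorporation can be sketched as a reachability computation. The C++ code below is an illustration under assumptions: regions are identified by plain strings, DependencyArcs and regions_to_reincorporate are hypothetical names, and a change is assumed to propagate transitively to every dependent region, which may be coarser than what Montana actually does.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// arcs[r] lists the regions that depend directly on region r.
using DependencyArcs = std::map<std::string, std::vector<std::string>>;

// Returns every region to re-incorporate when `changed` is edited: the
// region itself plus everything reachable along dependency arcs.
std::set<std::string> regions_to_reincorporate(const DependencyArcs& arcs,
                                               const std::string& changed) {
    std::set<std::string> dirty;
    std::vector<std::string> work{changed};
    while (!work.empty()) {
        std::string r = work.back();
        work.pop_back();
        if (!dirty.insert(r).second) continue;  // already visited
        auto it = arcs.find(r);
        if (it == arcs.end()) continue;         // nothing depends on r
        for (const std::string& d : it->second) work.push_back(d);
    }
    return dirty;
}
```

For instance, with an arc from a region holding a base class to the region holding its derived class, editing the base class region yields both regions, while editing the derived class region yields only that region.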


The Montana class CS_CodeStore is used to represent
the underlying CodeStore database. This class supports a 
variety of routines to create, query and update the 
CodeStore. An application that operates on a CodeStore 
will contain exactly one instance of the CS_CodeStore 
class. If an incorporation is currently taking place against 
the database, this CS_CodeStore instance will contain 
a reference to an object of type 
CS_IncorporationState, which represents the 
current state of the incorporation. 


4.2 Transformation 


The transformation step involves simplifying expressions 
and statements into a C-like representation. In Montana, 
it is implemented through the class CS_Transformer, 
which is shown in Figure 4. CS_Transformer has 
three versions of the method transform, corresponding
to the different types of transformations that are 
supported. Most calls to the 
CS_Transformer::transform methods are for 
statements or initializations. Expression transformations 
typically occur as part of the transformation of their 
containing statement. 


When the compiler needs to transform an item, it obtains 
a transformer object from the CodeStore’s incorporation
state object. The incorporation state in turn retrieves the 
transformer from an implementation component factory 
(see Figure 5). The incorporation state maintains a list of 
implementation component factories, and selects the 
transformer returned by the front element in the list. 


An implementation component can be one of four types: 
a type analyzer, a diagnostician, a transformer, or an 
optimizer. Each of these implementation components takes
part in a specific portion of the incorporation process, and 
can be overridden to modify the compilation process. 
Applications can provide custom implementation 
components by subclassing the implementation 
component factory class and providing overrides for the 
methods of interest. By inserting this new class at the 
front of the incorporation state list, the incorporation state 
will select the overridden component provided. 


For example, to provide a transformer (incorporation)
extension, one would derive a class from
CS_ImplementationComponentFactory, supplying a
transformer method that returns the custom transformer
object. The incorporation state method
prependImplementationComponentFactory
would be called to add this factory to the front of the list.
Any factory methods that are not overridden would return
the result of invoking that method against the next factory
in the list, as shown in Figure 5 with the method
invocation against the result of the next method.
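The delegation scheme described above can be modelled in a few lines of standalone C++. Every class in this sketch is a simplified stand-in with a hypothetical name and signature; none of it is the real CodeStore API. It shows only how a factory placed at the front of the list overrides one component and forwards the other requests.

```cpp
#include <string>

// Simplified stand-ins for two kinds of implementation component.
struct Transformer {
    virtual ~Transformer() {}
    virtual std::string name() const { return "default transformer"; }
};
struct Diagnostician {
    virtual ~Diagnostician() {}
    virtual std::string name() const { return "default diagnostician"; }
};

class ComponentFactory {
    ComponentFactory* next_;
public:
    explicit ComponentFactory(ComponentFactory* next) : next_(next) {}
    virtual ~ComponentFactory() {}
    // By default every request is forwarded to the next factory in the list.
    virtual Transformer& transformer() { return next_->transformer(); }
    virtual Diagnostician& diagnostician() { return next_->diagnostician(); }
};

// End of the chain: supplies the standard components.
class DefaultFactory : public ComponentFactory {
    Transformer t_;
    Diagnostician d_;
public:
    DefaultFactory() : ComponentFactory(nullptr) {}
    Transformer& transformer() override { return t_; }
    Diagnostician& diagnostician() override { return d_; }
};

struct SmartPointerTransformer : Transformer {
    std::string name() const override { return "smart pointer transformer"; }
};

// Extension factory: overrides only transformer().
class SmartPointerFactory : public ComponentFactory {
    SmartPointerTransformer t_;
public:
    explicit SmartPointerFactory(ComponentFactory* next) : ComponentFactory(next) {}
    Transformer& transformer() override { return t_; }
};
```

Constructing a SmartPointerFactory in front of a DefaultFactory and asking it for both components returns the custom transformer and, through delegation, the default diagnostician.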


Montana supplies a default implementation component 
factory that provides the standard implementations for 
each component. When no extensions have been 
introduced, this factory will be at the front of the 
incorporation state’s list. Figure 6 shows the relationship 
between the various classes discussed in this section. 


5. Implementing Smart Pointers with 
Montana 


In this section, we will present our implementation of
smart pointers through built-in pointers using a Montana
transformation incorporation extension. The complete
implementation is included in the Appendix A, and we
have extracted specific pieces to clarify the explanation.
We will first describe how to add our specific
transformation extension, and then present our
implementation for reference counting smart pointers
based on the required expression transformations
discussed earlier.

class CS_Transformer : public CS_IncorporationComponentBase<CS_DepthFirstModifier>
{
public:
    CS_Transformer(CS_IncorporationState& s) :
        CS_IncorporationComponentBase<CS_DepthFirstModifier>(s) { }

    // Transform a statement tree
    //
    virtual CS_bool transform(CS_Statement*& stmt, CS_bool emitMessages)
        { modifyStatement(*stmt); return CS_true; }

    // Transform a variable initializer
    //
    virtual CS_bool
    transform(CS_Initializer*& init, CS_VariableDeclaration& var, CS_bool emitMessages)
        { init = &modifyInitializer(*init, &var.typeDescriptor(), &var); return CS_true; }

    // Transform an expression tree
    //
    virtual CS_Expression& transform(CS_Expression& expr, CS_bool emitMessages)
        { return modifyExpression(expr); }
};

Figure 4 CS_Transformer


class CS_ImplementationComponentFactory : public CS_Link<CS_ImplementationComponentFactory>
{
public:
    virtual CS_TypeAnalyzer& typeAnalyzer() { return next()->typeAnalyzer(); }

    virtual CS_Diagnostician& diagnostician() { return next()->diagnostician(); }

    virtual CS_Optimizer& optimizer() { return next()->optimizer(); }

    virtual CS_Transformer& transformer() { return next()->transformer(); }
};

Figure 5 CS_ImplementationComponentFactory class


[Class diagram relating CS_CodeStore, CS_IncorporationState,
CS_ImplementationComponentFactory, and CS_Transformer]

Figure 6 Relationship Between Classes


SmartPointerImplementationComponentFactory::
SmartPointerImplementationComponentFactory(CS_IncorporationState& s)
    : _state(s),
      // The second argument to this constructor comes from a pull on the
      // chain of components stored in the IncorporationState.
      //
      _transformer(new SmartPointerTransformer(_state,
          _state.implementationComponentFactory().transformer()))
{
    assume(_transformer);
}

CS_Transformer& SmartPointerImplementationComponentFactory::
transformer()
{
    assume(_transformer);
    return *_transformer;
}

SmartPointerImplementationComponentFactory::
~SmartPointerImplementationComponentFactory()
{
    delete _transformer;
}

Figure 7 SmartPointerImplementationComponentFactory implementation


class SmartPointerImplementationComponentFactory
    : public CS_ImplementationComponentFactory {
public:
    SmartPointerImplementationComponentFactory(CS_IncorporationState&);

    virtual CS_Transformer& transformer();

    virtual ~SmartPointerImplementationComponentFactory();

private:
    CS_IncorporationState& _state;
    CS_Transformer* _transformer;
};

Figure 8 SmartPointerImplementationComponentFactory class


5.1 Creating a Transformation Incorporation 
Extension 


As described in the previous section, in order to
implement a transformation extension, we must create a
subclass of the CS_ImplementationComponentFactory
class and insert an object of this new type at
the front of the incorporation state’s factory list. Figure 8
shows the definition of the class
SmartPointerImplementationComponentFactory
and Figure 7 shows the corresponding
implementation. The method transformer is overridden to
return our custom smart pointer transformer extension.
The constructor for the factory initializes the
_transformer member by creating a new object of
class SmartPointerTransformer. This latter class
will implement our smart pointer transformer extension,
and will be discussed in more detail subsequently. Note
that the transformer extension constructor is passed the
incorporation state and the current transformer object,
obtained from the front of the factory list.


5.2 Dynamically Loading a Transformation
Incorporation Extension


We now have a factory implementation that will return 
our custom transformation extension. The next issue to 
address is how the factory object will be created and 
added to the front of the incorporation state’s factory list. 
This will be achieved by loading a dynamic link library 
(DLL) that contains a static variable whose initialization 
will cause the factory to be created and inserted into the 
list. Then the question is, how is the DLL loaded? We will 
now examine the Montana support for defining and 
loading extensions. 


Externally, a Montana extension is introduced using an
Incremental C++ Extension, or ice, file. Montana
searches for and applies ice files at load time according to
a defined search order. Figure 9 shows an ice file which
defines an extension called SmartPointer, for which
the corresponding DLL to load is smartp.dll. The
suffix and prefix information is used to associate a
specific file type with a given extension. This interface is
for CodeStore extensions, but is used as a temporary
measure until the final interface for incorporation
extensions is defined.


Montana programs are compiled by providing a 
configuration file which supplies the various options for 
the compilation. For our purposes, the configuration file 
shown in Figure 10 is used. 


This configuration file indicates that the source file to be
compiled is t.cpp, the target executable will be t.exe,
and that an additional source file called dummy.sp will
also be processed. This latter source file, being an
unsupported file type, will cause Montana to search the
ice files for an appropriate extension that handles this file
type, and load the extension DLL smartp.dll.


The next step is to register an extension dynamic load
point using a statically-defined variable in the
smartp.dll extension DLL, as shown in Figure 11. An
extension dynamic load point is used to register an
extension with the compiler. For an incorporation
extension, the final parameter to the extension dynamic
load point constructor, the incorporation startup function
pointer, is the most important. This function will be run at
the start of every incorporation and can be used by an
extension to plug components into the incorporation
state.
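Registration through a static variable relies on C++ running the constructors of namespace-scope objects when the module that contains them is loaded. A minimal sketch of that idiom follows. LoadPoint, startupRegistry, beginIncorporation, and the plain int standing in for the incorporation state are hypothetical simplifications, not Montana's CS_ExtensionDynamicLoadPoint interface.

```cpp
#include <vector>

// Hypothetical startup callback type; the int stands in for the state object.
typedef void (*StartupFn)(int& state);

// Function-local static, so the registry exists before any LoadPoint uses it.
std::vector<StartupFn>& startupRegistry() {
    static std::vector<StartupFn> fns;
    return fns;
}

// Constructing a LoadPoint records the extension's startup function.
struct LoadPoint {
    explicit LoadPoint(StartupFn fn) { startupRegistry().push_back(fn); }
};

// The extension's startup function: here it just marks that it ran.
void smartPointerStartup(int& state) { state += 1; }

// Static object: its constructor runs when the module is loaded,
// which is what registers the extension.
static LoadPoint smartPointerLoadPoint(smartPointerStartup);

// Called by the (hypothetical) compiler at the start of every incorporation.
void beginIncorporation(int& state) {
    for (StartupFn fn : startupRegistry()) {
        fn(state);
    }
}
```

Nothing in the sketch calls the registration explicitly; loading the module is enough, which mirrors why the text only needs the DLL to be loaded for the factory to be installed.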


[Smart Pointer]
type=extension
description=Smart Pointer Extension
dll=smartp.dll
suffixes=sp SP
prefix=dummy

Figure 9 ice File for Smart Pointer Extension


source type(cpp) src0 = "t.cpp"
target "t.exe" { source src0 }
source type(sp) src1 = "dummy.sp"

Figure 10 Montana Configuration File


CS_ExtensionDynamicLoadPoint
SmartPointer::extension_load_point(
    SmartPointer::className(),
    SmartPointer::update,
    SmartPointer::isChanged,
    SmartPointer::processOptions,
    EXTENSION_PRIORITY,
    SmartPointer::incorporationStartup);

Figure 11 Extension Dynamic Load Point


class SmartPointer : public CS_InterfaceBase
{
public:
    static const char* className();
    static void SmartPointer::incorporationStartup(
        CS_ExtensionDynamicLoadPointLink&, CS_IncorporationState&);
    static CS_DependencyNode::UpdateResult
    update(CS_ExtensionSource* me, CS_IncorporationState& state,
           CS_bool emitMessages);
    static CS_bool isChanged(CS_ExtensionSource* me);
    static void processOptions(CS_ExtensionSource* me, CS_OptionList& options);
private:
    static CS_ExtensionDynamicLoadPoint extension_load_point;
};

Figure 12 SmartPointer class


void SmartPointer::incorporationStartup(
    CS_ExtensionDynamicLoadPointLink&,
    CS_IncorporationState& state)
{
    cout << __FUNCTION__ << endl;

    SmartPointerImplementationComponentFactory* fac =
        new SmartPointerImplementationComponentFactory(state);
    assume(fac); // (our version of "assert")

    // Push our new factory with its new Transformer onto the chain
    // stored in the IncorporationState.
    //
    state.prependImplementationComponentFactory(*fac);
    return;
}

Figure 13 incorporationStartup method


The main effect, then, of constructing the static member
variable SmartPointer::extension_load_point
is that the method SmartPointer::incorporationStartup
will be called prior to each
incorporation. The class SmartPointer and the
incorporationStartup method are shown in
Figure 12 and Figure 13. In the incorporationStartup
method, the newly-created
SmartPointerImplementationComponentFactory object is
added to the front of the incorporation state’s factory list
(see Figure 6). Adding this custom factory object to the
front of the list will cause any requests made of the
incorporation state for a transformer object to return the
custom transformer, SmartPointerTransformer.


5.3 The SmartPointerTransformer class 


At this point, whenever a transformation takes place, the
SmartPointerTransformer class (see Figure 14)
will have control. One of the three overridden
transform methods shown at the beginning of the class
will be called depending upon the type of transformation
taking place: a statement, an initialization, or an expression.
The overridden versions of the
SmartPointerTransformer methods are shown in
Figure 15. These transform methods have fairly
standard implementations. They first call an appropriate
modify method, and then invoke the transform
method of the previous element in the component chain
(given by the member variable _parent).


Recall that the constructor for the
SmartPointerTransformer class is passed the
current transformer object, which is used to initialize the
data member _parent. Calling the parent transform
method allows the standard compiler transformations to
take place after the extension has been run.

It is when the modify method is called that the
transformer extension has an opportunity to modify the
transformed expression. The compiler-supplied modify
methods step through the underlying item and call an
appropriate modifyXXX method for each entity
encountered. By overriding the methods corresponding to
the expressions of interest, the transformer extension can
modify these expressions. In this case, we have
overridden modifyAssignExpression,
modifyExpressionInitializer, modifyImplicitInitializer,
and modifyDestructorStateChangeExpression
(these methods will be explained
in more detail in the next section). If the underlying
expression or statement corresponds to one of these four,
the overriding method will be called. Each of these
methods determines if any further expression
transformation is necessary, based on the type of the
object being operated on. If so, the expression is
transformed according to the model described earlier for
transforming built-in pointer operations into smart pointer
operations.
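The interplay between the transform methods, the per-node modify hooks, and the delegation to _parent can be pictured with a small sketch. It is only an illustration under simplifying assumptions: Expr, Transformer, and ExtensionTransformer are hypothetical names, a statement is modelled as a flat vector of nodes rather than a tree, and the string rewrite stands in for a real tree transformation.

```cpp
#include <string>
#include <vector>

// Hypothetical miniature expression node; not Montana's CS_Expression.
struct Expr {
    enum Kind { Assign, Other } kind;
    std::string text;
};

// Base transformer: walks a statement and calls a modify hook per node.
class Transformer {
public:
    virtual ~Transformer() {}
    virtual void transform(std::vector<Expr>& stmt) { modifyStatement(stmt); }
protected:
    void modifyStatement(std::vector<Expr>& stmt) {
        for (Expr& e : stmt) {
            if (e.kind == Expr::Assign) modifyAssignExpression(e);
        }
    }
    // Default hook leaves the node unchanged.
    virtual void modifyAssignExpression(Expr&) {}
};

// Extension: overrides one hook and then hands the result to its parent.
class ExtensionTransformer : public Transformer {
public:
    explicit ExtensionTransformer(Transformer& parent) : parent_(parent) {}
    void transform(std::vector<Expr>& stmt) override {
        modifyStatement(stmt);     // the extension's rewrite runs first
        parent_.transform(stmt);   // then the standard transformation
    }
protected:
    void modifyAssignExpression(Expr& e) override {
        e.text = "(dec, " + e.text + ", inc)";
    }
private:
    Transformer& parent_;
};
```

Only assignment nodes reach the overridden hook in this sketch; every other node passes through untouched, and the parent still sees the statement afterwards.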


5.4 CS_SmartPointerTransformer::modify 


Now we will discuss the implementation of the
CS_SmartPointerTransformer::modify
methods. Each of these methods uses the
SmartPointerTransformer::transformerImplementation
method to determine if the current
expression or statement deals with an object of interest.
This method returns a
SmartPointerTransformerImplementation
that will perform implementation-specific
transformations, depending upon the smart pointer type.
We will discuss this latter class in the next section.

class SmartPointerTransformer : public CS_Transformer {
public:
    SmartPointerTransformer(CS_IncorporationState& s, CS_Transformer& p)
        : CS_Transformer(s), _parent(p),
          _referenceCounterTransformerImplementation(0) {
    }

    ~SmartPointerTransformer();

    virtual CS_bool transform(CS_Statement*& stmt, CS_bool emitMessages);
    virtual CS_bool transform(CS_Initializer*&, CS_VariableDeclaration&, CS_bool);
    virtual CS_Expression& transform(CS_Expression&, CS_bool);

    virtual CS_Expression& modifyAssignExpression(CS_BinaryExpression&);
    virtual CS_Initializer& modifyExpressionInitializer(
        CS_ExpressionInitializer&, CS_TypeDescriptor*, CS_VariableDeclaration*);
    virtual CS_Initializer& modifyImplicitInitializer(
        CS_ImplicitInitializer&, CS_TypeDescriptor*, CS_VariableDeclaration*);
    virtual CS_Expression& modifyDestructorStateChangeExpression(
        CS_DestructorStateChangeExpression&);

    CS_Expression& typeAnalyze(CS_Expression&);
    CS_Initializer& typeAnalyze(CS_Initializer&);

private:
    CS_Transformer& _parent;

    // classes for each smart pointer implementation
    ReferenceCounterTransformerImplementation* _referenceCounterTransformerImplementation;

    // Return the smart pointer implementation, if any, for the expression
    // The expression must be a pointer to a class derived from a SmartPointer class
    SmartPointerTransformerImplementation* transformerImplementation(CS_TypeDescriptor&);
};

Figure 14 SmartPointerTransformer class


CS_bool SmartPointerTransformer::transform(CS_Statement*& stmt, CS_bool emitMessages)
{
    modifyStatement(*stmt);
    _parent.transform(stmt, emitMessages);
    return CS_true;
}

CS_bool SmartPointerTransformer::
transform(CS_Initializer*& init, CS_VariableDeclaration& var, CS_bool emitMessages)
{
    init = &modifyInitializer(*init, &var.typeDescriptor(), &var);
    _parent.transform(init, var, emitMessages);
    return CS_true;
}

CS_Expression& SmartPointerTransformer::transform(CS_Expression& expr, CS_bool emitMessages)
{
    CS_Expression *expr2 = &modifyExpression(expr);
    return _parent.transform(*expr2, emitMessages);
}

Figure 15 SmartPointerTransformer transform methods


CS_Expression& SmartPointerTransformer::modifyAssignExpression(CS_BinaryExpression& binary)
{
    if (! binary.expression1().typeDescriptor().isPointer())
        return binary;

    SmartPointerTransformerImplementation *ti =
        transformerImplementation(*binary.expression1().typeDescriptor().next());

    return ti ? typeAnalyze(ti->modifyAssignExpression(binary)) : binary;
}

Figure 16 modifyAssignExpression method


CS_Initializer& SmartPointerTransformer::modifyExpressionInitializer(
    CS_ExpressionInitializer& init, CS_TypeDescriptor* td, CS_VariableDeclaration* var)
{
    if (!td)
        return init;

    // need to handle both pointer and object initialization
    SmartPointerTransformerImplementation *ti =
        transformerImplementation(td->isPointer() ? *td->next() : *td);
    if (!ti)
        return init;

    return typeAnalyze(ti->modifyExpressionInitializer(init, td, var));
}

Figure 17 modifyExpressionInitializer method


CS_Initializer& SmartPointerTransformer::modifyImplicitInitializer(
    CS_ImplicitInitializer& init, CS_TypeDescriptor* td, CS_VariableDeclaration* var)
{
    if (!td)
        return init;

    // need to handle both pointer and object initialization
    SmartPointerTransformerImplementation *ti =
        transformerImplementation(td->isPointer() ? *td->next() : *td);
    if (!ti)
        return init;

    return typeAnalyze(ti->modifyImplicitInitializer(init, td, var));
}

Figure 18 modifyImplicitInitializer method


CS_Expression& SmartPointerTransformer::
modifyDestructorStateChangeExpression(CS_DestructorStateChangeExpression& dsce)
{
    return SmartPointerTransformerImplementation::
        modifyDestructorStateChangeExpression(dsce);
}

CS_Expression& SmartPointerTransformerImplementation::
modifyDestructorStateChangeExpression(CS_DestructorStateChangeExpression& dsce)
{
    // save most recent state table entry
    if (dsce.tableEntry() && dsce.tableEntry()->asDestructorStateTableEntry())
        _currentDestructorStateTableEntry = dsce.tableEntry()->asDestructorStateTableEntry();

    return dsce;
}

Figure 19 modifyDestructorStateChangeExpression methods


The modifyAssignExpression method is shown in
Figure 16. For our smart pointer implementation, we want
to detect any pointer assignments where the underlying
type is derived from a special base class such as
ReferenceCounter. modifyAssignExpression
is passed a reference to a CS_BinaryExpression
object, representing the assignment expression currently
being transformed. If the type descriptor for the lhs of this
expression, given by CS_BinaryExpression::expression1(),
is not a pointer, then no further work
is necessary. If it is a pointer, then the type to which it
points, given by CS_TypeDescriptor::next(), is
passed to transformerImplementation to check
if it points to an object derived from one of the special
base classes. If a non-null value is returned, the
modifyAssignExpression method of the returned
implementation object is called to modify the expression.

The modifyExpressionInitializer and
modifyImplicitInitializer methods (Figure 17
and Figure 18) perform similar functions, but must handle
both pointer and non-pointer initializations. These
methods call transformerImplementation
with the CS_TypeDescriptor for the non-pointer
variable being initialized. If the variable being initialized
is a pointer, the CS_TypeDescriptor for the type
pointed to, given by td->next(), is passed instead.


The modifyDestructorStateChangeExpression
method, shown in Figure 19, does not
actually perform any modification against the given
expression. Rather, it calls the static method
SmartPointerTransformerImplementation::modifyDestructorStateChangeExpression,
which simply saves the most recently seen state
table entry if that entry is for a destructor. This value will
be used later to add information to the state table in the
appropriate location.


5.5 SmartPointerTransformerImplementation class 


As discussed earlier, the model for our smart pointer
implementation is that a built-in pointer will be
transformed into a smart pointer if the underlying type
inherits from a special base class. In order to provide
multiple smart pointer transformations, we have defined
a common base class called
SmartPointerTransformerImplementation (Figure 20), from
which a derived class will be defined for each smart
pointer implementation. This derived class will handle
the transformations specific to that smart pointer
implementation.


When a method needs to determine if a smart pointer
implementation applies to a given expression, it calls the
method SmartPointerTransformer::transformerImplementation,
shown in Figure 21. The transformerImplementation method is
passed a reference to an object of type
CS_TypeDescriptor, which describes the type of the
object being operated on. If the object is of a class type, a
reference to the associated CS_ClassDeclaration is
assigned to the variable decl. A
CS_ClassDeclaration provides complete
information about a class declaration, such as the class
name, members, etc. So at this point, decl will reference
the class declaration for the object of interest.


Next, the findClassDeclaration method of the
ReferenceCounterTransformerImplementation
class is invoked. This method returns a pointer to the
CS_ClassDeclaration object representing the class
ReferenceCounter, if that declaration has been
encountered in the program, and null otherwise. If non-null
is returned, the method uses the
CS_ANSI_Queries::isBaseClassOf method to
determine if the class declaration for the object of interest
is derived from ReferenceCounter. The
CS_ANSI_Queries class provides a variety of
functions that support querying of class declarations. If
ReferenceCounter is a base class, the
_referenceCounterTransformerImplementation
data member will be initialized with
a new ReferenceCounterTransformerImplementation
object if it has not yet been
initialized.


In the current implementation, we have defined one
special base class, ReferenceCounter, and one
corresponding specialization of
SmartPointerTransformerImplementation.
The programmer would include the declaration of the
ReferenceCounter class (see Figure 22) and derive
from it to introduce reference-counting functionality for
pointer operations against objects of that derived class.
(The name member in ReferenceCounter is used for
debugging purposes and will be discussed later.)
Additional smart pointer implementations could be
introduced by inserting code at the end of the
transformerImplementation method where
indicated.


5.6 ReferenceCounterTransformerImplementation 
class 


The ReferenceCounterTransformerImplementation class (see
Figure 23) provides the transformer extension
implementation for a reference counting smart pointer. It
is a specialization of the
SmartPointerTransformerImplementation, and contains
overrides of the
SmartPointerTransformerImplementation::modify methods, along with
additional methods specific to implementing a reference
counting smart pointer.

class SmartPointerTransformerImplementation : public
    CS_IncorporationComponentBase<CS_InterfaceBase> {
public:
    SmartPointerTransformerImplementation(CS_IncorporationState &state,
                                          SmartPointerTransformer &transformer) :
        CS_IncorporationComponentBase<CS_InterfaceBase>(state), _transformer(transformer) {}

    virtual CS_Expression& modifyAssignExpression(CS_BinaryExpression&) = 0;
    virtual CS_Initializer& modifyExpressionInitializer(
        CS_ExpressionInitializer&, CS_TypeDescriptor*, CS_VariableDeclaration*) = 0;
    virtual CS_Initializer& modifyImplicitInitializer(
        CS_ImplicitInitializer&, CS_TypeDescriptor*, CS_VariableDeclaration*) = 0;

    static CS_Expression& modifyDestructorStateChangeExpression(
        CS_DestructorStateChangeExpression& dsce);

    CS_DestructorStateTableEntry* currentDestructorStateTableEntry()
        { return _currentDestructorStateTableEntry; }

    void currentDestructorStateTableEntry(CS_DestructorStateTableEntry *ste)
        { _currentDestructorStateTableEntry = ste; }

    SmartPointerTransformer& transformer() { return _transformer; }
private:
    SmartPointerTransformer& _transformer;
    static CS_DestructorStateTableEntry *_currentDestructorStateTableEntry;
};

Figure 20 SmartPointerTransformerImplementation class





SmartPointerTransformerImplementation *SmartPointerTransformer::
transformerImplementation(CS_TypeDescriptor& td)
{
    if (! td.isNamedType() ||
        td.declaration().declarationKind() != CS_Declaration::IsClass)
        return NULL;
    CS_ClassDeclaration &decl = *td.declaration().asClassDeclaration();

    CS_ClassDeclaration *referenceCounter =
        ReferenceCounterTransformerImplementation(state(), *this).findClassDeclaration();
    if (referenceCounter &&
        CS_ANSI_Queries::isBaseClassOf(*referenceCounter, decl)) {
        cout << "got a pointer to class derived from "
             << referenceCounter->signature() << endl;
        if (! _referenceCounterTransformerImplementation) {
            _referenceCounterTransformerImplementation =
                new ReferenceCounterTransformerImplementation(state(), *this);
        }
        return _referenceCounterTransformerImplementation;
    }

    // insert code to look for other smart pointer class implementations here
    return NULL;
}

Figure 21 transformerImplementation method





class ReferenceCounter {
private:
    int rc;
    char *_name;

    static void dtor(ReferenceCounter **sp, int);
    static void decrement(ReferenceCounter *sp);
    static void increment(ReferenceCounter *sp);

public:
    ReferenceCounter(char *name) : _name(name) { rc = 0; }
    char *name() { return _name; }
    virtual ~ReferenceCounter();
};

Figure 22 ReferenceCounter class


class ReferenceCounterTransformerImplementation :
    public SmartPointerTransformerImplementation {
public:
    ReferenceCounterTransformerImplementation(
        CS_IncorporationState& state, SmartPointerTransformer& transformer) :
        SmartPointerTransformerImplementation(state, transformer) {};

    CS_ClassDeclaration* findClassDeclaration();

    CS_Expression& modifyAssignExpression(CS_BinaryExpression& binary);
    CS_Initializer& modifyExpressionInitializer(
        CS_ExpressionInitializer&, CS_TypeDescriptor*, CS_VariableDeclaration*);
    CS_Initializer& modifyImplicitInitializer(
        CS_ImplicitInitializer&, CS_TypeDescriptor*, CS_VariableDeclaration*);

private:
    const CS_Atom& referenceCountMember();
    virtual CS_Expression& decrementReferenceCounterExpression(CS_Expression &);
    virtual CS_Expression& incrementReferenceCounterExpression(CS_Expression &);
    virtual CS_Expression& createStateChangeExpression(
        CS_Expression&, CS_VariableDeclaration&);
    virtual CS_FunctionDeclaration& findOrCreateDecrement();
    virtual CS_FunctionDeclaration& findOrCreateIncrement();
    virtual CS_FunctionDeclaration& findOrCreateDtor();
    virtual CS_FunctionDeclaration& findOrCreateMemberFunction(char *);
    virtual CS_DestructorStateTableEntry& createDestructorStateTableEntry(
        CS_VariableDeclaration&, CS_TreeNode&);
    virtual void addDestructorCalls(
        CS_VariableDeclaration&, CS_DestructorStateTableEntry&, CS_TokenLocation&);
};

Figure 23 ReferenceCounterTransformerImplementation class


CS_Expression&
ReferenceCounterTransformerImplementation::modifyAssignExpression(
    CS_BinaryExpression &binary)
{
    // Don’t transform expressions on temporaries
    CS_Expression &e = binary.expression1();
    if (e.expressionKind() == CS_Expression::IsName &&
        e.asNameExpression()->name().declaration() &&
        ! e.asNameExpression()->name().declaration()->mapsToASourceLocation()) {
        return binary;
    }

    CS_TokenLocation loc =
        binary.sourceLocation().sourceRegion()->tokenLocation();

    CS_Expression &expr =
        ef().createCommaExpression(loc,
            decrementReferenceCounterExpression(binary.expression1()),
            ef().createCommaExpression(loc,
                binary,
                incrementReferenceCounterExpression(binary.expression1())));
    return expr;
}

Figure 24 modifyAssignExpression method


CS_Initializer& ReferenceCounterTransformerImplementation::
modifyExpressionInitializer(CS_ExpressionInitializer& init,
                            CS_TypeDescriptor* td,
                            CS_VariableDeclaration* var)
{
    assume(var);

    CS_TokenLocation loc =
        init.expression().sourceLocation().sourceRegion()->tokenLocation();

    if (td->isPointer()) {
        CS_BinaryExpression *be =
            init.expression().asBinaryExpression();
        assume(be);

        // C *c1(x) will already be transformed to C *c1 = x;
        assume(be->binaryExpressionKind() == CS_BinaryExpression::opAssign);

        CS_Expression &expr1 = be->expression1();

        // change C *c1; c1 = x to
        // C *c1; c1 = (c1=x, c1 != 0 ? c1->rc++ : 0, c1)
        init.setExpression(
            ef().createAssignExpression(loc,
                expr1,
                ef().createCommaExpression(loc,
                    ef().createAssignExpression(loc,
                        ic().cloneExpression(expr1),
                        transformer().modifyExpression(be->expression2())),
                    ef().createCommaExpression(loc,
                        incrementReferenceCounterExpression(expr1),
                        ef().createCommaExpression(loc,
                            createStateChangeExpression(expr1, *var),
                            ic().cloneExpression(expr1))))));
        return init;
    }
    // For non-dynamic variables, initialize reference count
    // to 1 so that never get collected.
    init.setExpression(
        ef().createCommaExpression(loc,
            transformer().modifyExpression(init.expression()),
            ef().createAssignExpression(loc,
                ef().createDotExpression(loc,
                    ef().createNameExpression(loc, *var),
                    referenceCountMember()),
                ef().createLiteralExpression(
                    cs(), loc, intType(), 1))));
    return init;
}

Figure 25 modifyExpressionInitializer method


5.6.1 Transforming Pointer Assignments 


As discussed earlier, when an assignment to a reference- 
counting smart pointer occurs, we want to transform an 
expression such as x = y, where x points to a type 
derived from ReferenceCounter, to the following: 


(x == y ? x :
decrement(x), x = y,
increment(x), x)


This is achieved through the
modifyAssignExpression method shown in
Figure 24. The first thing the method does is check if the
left-hand side (lhs) of the expression (given by
binary.expression1()) is a compiler-generated
temporary. Unlike programmer-declared variables, the
declaration of a temporary will not have a corresponding
source location. Our implementation does not currently
handle temporaries, and assignments to such are ignored.
Temporaries are discussed in more detail in section 7.


If the target of the assignment is not a temporary, a new
comma expression is created using the result of the
decrementReferenceCounterExpression and
incrementReferenceCounterExpression methods along with
the current assignment expression. The
findOrCreateDecrement method called in the
decrementReferenceCounterExpression
method locates or creates a declaration corresponding to
the ReferenceCounter::decrement method
declared earlier. Note that the right-hand-side (rhs) value
is used three times in the resulting expression, in
decrement, assignment, and increment expressions. We
should generate a temporary to hold the rhs value so that
expressions containing side-effects are not executed
multiple times. Further, we currently do not generate code
to handle the initial check for the lhs being equal to the
rhs, which requires a temporary for both the lhs and the
rhs, or the final expression containing just the rhs for the
assignment value. This support would be added when
temporaries are handled by our implementation (see
section 7).
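To make the intended run-time effect of a transformed assignment concrete, the following hand-written sketch applies the decrement, assign, increment sequence to a plain pointer. It is an illustration only: Counted, g_deleted, and assign are hypothetical names, the helper includes the equality check and the result value that the text says are not yet generated, and none of it is code produced by the extension.

```cpp
// Hypothetical stand-in for a class derived from ReferenceCounter.
struct Counted {
    int rc;
    Counted() : rc(0) {}
};

int g_deleted = 0;  // number of objects reclaimed, kept only for illustration

// Add one reference; null pointers are ignored.
void increment(Counted* p) {
    if (p) ++p->rc;
}

// Drop one reference and reclaim the object when none remain.
void decrement(Counted* p) {
    if (p && --p->rc == 0) {
        delete p;
        ++g_deleted;
    }
}

// Hand-written equivalent of the transformed expression for `x = y`.
// Passing y by value evaluates the right-hand side exactly once.
Counted* assign(Counted*& x, Counted* y) {
    return (x == y) ? x : (decrement(x), x = y, increment(x), x);
}
```

Because y is a function parameter here, a right-hand side with side effects is evaluated once, which is the role the text assigns to the temporary it plans to introduce.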


5.6.2 Transforming Initialization Expressions 


The modifyExpressionInitializer method,
shown in Figure 25, transforms initialization expressions
corresponding to the model described earlier. Much of this
method is fairly self-explanatory. For pointers, however,
there is an additional action performed, which is to add
state change information. If a pointer goes out of scope,
either due to an exception or control implicitly returning
from the function, the appropriate reference count
decrement must take place. This is achieved through
calling the createStateChangeExpression
method, which will use the saved state change variable to
create a state change node and insert it in the table in the
appropriate place, based on the most recent state change
that occurred within the function. This will cause code to
be inserted at the end of the function on implicit scope
termination to call the dtor method defined for the class
ReferenceCounter, which will decrement the
reference count. See the appendix for details.


#include "ReferenceCounterInterface.h"

class C : public ReferenceCounter {
public:
    int i;
    C(char *name) : ReferenceCounter(name) {}
};

int main()
{
    cout << "C c1;" << endl;
    C c1("c1");

    cout << "C *cp1 = &c1;" << endl;
    C *cp1 = &c1;

    cout << "C *cp2 = new C(\"new C 1\");" << endl;
    C *cp2 = new C("new C 1");

    cout << "cp2 = 0;" << endl;
    cp2 = 0;
    cout << "cp1 = cp2;" << endl;
    cp1 = cp2;

    cout << "C *cp3 = new C(\"new C 2\");" << endl;
    C *cp3 = new C("new C 2");

    cout << "C c2;" << endl;
    C c2("c2");

    cout << "cp1 = &c2;" << endl;
    cp1 = &c2;

    return 0;
}

Figure 26 Test program
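The scope-exit decrement that createStateChangeExpression arranges through the state table behaves, from the program's point of view, much like a scope guard. The sketch below uses an ordinary C++ destructor to imitate that effect. It is an analogy under stated assumptions rather than the mechanism Montana uses: Counted, release, ScopeRelease, useOnce, and g_reclaimed are all hypothetical names.

```cpp
// Hypothetical stand-in for a class derived from ReferenceCounter.
struct Counted {
    int rc;
    Counted() : rc(0) {}
};

int g_reclaimed = 0;  // number of objects reclaimed, kept only for illustration

// Drop one reference and reclaim the object when none remain.
void release(Counted* p) {
    if (p && --p->rc == 0) {
        delete p;
        ++g_reclaimed;
    }
}

// Plays the role of the state table entry: its destructor runs on both
// normal and exceptional exit from the enclosing scope.
class ScopeRelease {
public:
    explicit ScopeRelease(Counted*& slot) : slot_(slot) {}
    ~ScopeRelease() { release(slot_); }
    ScopeRelease(const ScopeRelease&) = delete;
    ScopeRelease& operator=(const ScopeRelease&) = delete;
private:
    Counted*& slot_;
};

// A function whose local pointer goes out of scope in two different ways.
void useOnce(bool fail) {
    Counted* p = new Counted;
    ++p->rc;                 // initialization of the pointer adds a reference
    ScopeRelease guard(p);
    if (fail) throw 1;       // exceptional exit still releases the reference
}
```

Either exit path leaves the count at zero and reclaims the object, which corresponds to the two cases the text names: an exception and control implicitly returning from the function.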


6. An Example


To demonstrate the smart pointer transformer extension in
action, Figure 26 shows a simple program containing
several pointer declarations and assignments. Figure 27
shows the resulting execution output after compiling the
program with the transformation extension. The built-in
pointers act like smart pointers!


C c1;
C *cp1 = &c1;
>> Incrementing count for c1 to 2
C *cp2 = new C("new C 1");
>> Incrementing count for new C 1 to 1
cp2 = 0;
>> Decrementing count for new C 1 to 0
>> Deleting new C 1 with 0 references
cp1 = cp2;
>> Decrementing count for c1 to 1
C *cp3 = new C("new C 2");
>> Incrementing count for new C 2 to 1
C c2;
cp1 = &c2;
>> Incrementing count for c2 to 2
>> Deleting c2 with 2 references
>> Decrementing count for new C 2 to 0
>> Deleting new C 2 with 0 references
>> Decrementing count for c2 to -1
>> Deleting c1 with 1 references

Figure 27 Test Program Output


transformed tree for: int main()
{
    __ef __fsm_tab = { 0xBEEFDEAD, 4, {
        { <offset of c1 + 0>, &C::__dftdt, 1, 16, 0, 0 },
        { <offset of @1 + 0>, &operator delete, -3, 16, 0, 1 },
        { <offset of @2 + 0>, &operator delete, -3, 16, 0, 2 },
        { <offset of c2 + 0>, &C::__dftdt, 1, 16, 0, 3 } } };
    __est __es = { 0, 0, &__fsm_tab, (long int *) 0, 0 };
    *ostream::operator<<(ostream::operator<<((ostream *) &cout, "C c1;"), endl);
    C c1; *C::C(&c1, "c1") , __es.__s = 1;
    *ostream::operator<<(ostream::operator<<((ostream *) &cout, "C *cp1 = &c1;"), endl);
    C *cp1; cp1 = &c1;
    *ostream::operator<<(ostream::operator<<((ostream *) &cout,
        "C *cp2 = new C(\"new C 1\");"), endl);
    C *cp2; cp2 = (( @1 = ::operator new(16) ?
        __es.__s = 2, C::C(@1, "new C 1") , __es.__s = 1 : 0) , @1);
    *ostream::operator<<(ostream::operator<<((ostream *) &cout, "cp2 = 0;"), endl);
    cp2 = 0;
    *ostream::operator<<(ostream::operator<<((ostream *) &cout, "cp1 = cp2;"), endl);
    cp1 = cp2;
    *ostream::operator<<(ostream::operator<<((ostream *) &cout,
        "C *cp3 = new C(\"new C 2\");"), endl);
    C *cp3; cp3 = (( @2 = ::operator new(16) ?
        __es.__s = 3 , C::C(@2, "new C 2") , __es.__s = 2 : 0) , @2);
    *ostream::operator<<(ostream::operator<<((ostream *) &cout, "C c2;"), endl);
    C c2; *C::C(&c2, "c2") , __es.__s = 4;
    *ostream::operator<<(ostream::operator<<((ostream *) &cout, "cp1 = &c2;"), endl);
    cp1 = &c2;
    return @3 = 0, (__es.__s = 3, C::~C(&c2, 2, 0) , (__es.__s = 0, C::~C(&c1, 2, 0))) , @3;
}

Figure 28 Transformed Expressions Without Smart Pointer Extension 
transformed tree for: int main()
{
    __ef __fsm_tab = { 0xBEEFDEAD, 7, {
        { <offset of c1 + 0>, &C::__dftdt, 1, 16, 0, 0 },
        { <offset of cp1 + 0>, &ReferenceCounter::dtor, 1, 4, 0, 1 },
        { <offset of cp2 + 0>, &ReferenceCounter::dtor, 1, 4, 0, 2 },
        { <offset of @0 + 0>, &operator delete, -3, 16, 0, 3 },
        { <offset of cp3 + 0>, &ReferenceCounter::dtor, 1, 4, 0, 4 },
        { <offset of @1 + 0>, &operator delete, -3, 16, 0, 5 },
        { <offset of c2 + 0>, &C::__dftdt, 1, 16, 0, 6 } } };
    __est __es = { 0, 0, &__fsm_tab, (long int *) 0, 0 };
    *ostream::operator<<(ostream::operator<<((ostream *) &cout, "C c1;"), endl);
    C c1; *C::C(&c1, "c1") , __es.__s = 1;
    c1.rc = 1;
    *ostream::operator<<(ostream::operator<<((ostream *) &cout, "C *cp1 = &c1;"), endl);
    C *cp1; cp1 = (cp1 = &c1,
        (ReferenceCounter::increment(static_cast<ReferenceCounter *>(cp1)),
        (cp1 , __es.__s = 2, cp1)));
    *ostream::operator<<(ostream::operator<<((ostream *) &cout,
        "C *cp2 = new C(\"new C 1\");"), endl);
    C *cp2; cp2 = (cp2 = (( @0 = ::operator new(16) ? __es.__s = 4, C::C(@0, "new C 1") ,
        __es.__s = 3 : 0), @0) ,
        (ReferenceCounter::increment(static_cast<ReferenceCounter *>(cp2)),
        (cp2 , __es.__s = 3, cp2)));
    *ostream::operator<<(ostream::operator<<((ostream *) &cout, "cp2 = 0;"), endl);
    ReferenceCounter::decrement(static_cast<ReferenceCounter *>(cp2)) ,
        (cp2 = 0, ReferenceCounter::increment(static_cast<ReferenceCounter *>(cp2)));
    *ostream::operator<<(ostream::operator<<((ostream *) &cout, "cp1 = cp2;"), endl);
    ReferenceCounter::decrement(static_cast<ReferenceCounter *>(cp1)) ,
        (cp1 = cp2 , ReferenceCounter::increment(static_cast<ReferenceCounter *>(cp1)));
    *ostream::operator<<(ostream::operator<<((ostream *) &cout,
        "C *cp3 = new C(\"new C 2\");"), endl);
    C *cp3; cp3 = (cp3 = (( @1 = ::operator new(16) ? __es.__s = 6, C::C(@1, "new C 2") ,
        __es.__s = 5 : 0), @1) ,
        (ReferenceCounter::increment(static_cast<ReferenceCounter *>(cp3)) ,
        (cp3 , __es.__s = 5, cp3)));
    *ostream::operator<<(ostream::operator<<((ostream *) &cout, "C c2;"), endl);
    C c2; *C::C(&c2, "c2") , __es.__s = 7;
    c2.rc = 1;
    foo();
    *ostream::operator<<(ostream::operator<<((ostream *) &cout, "cp1 = &c2;"), endl);
    ReferenceCounter::decrement(static_cast<ReferenceCounter *>(cp1)) ,
        (cp1 = &c2 , ReferenceCounter::increment(static_cast<ReferenceCounter *>(cp1)));
    return @2 = 0, (__es.__s = 6, C::~C(&c2, 2, 0),
        (__es.__s = 5, ReferenceCounter::decrement(static_cast<ReferenceCounter *>(cp3))) ,
        (__es.__s = ..., ReferenceCounter::decrement(static_cast<ReferenceCounter *>(cp2))) ,
        (__es.__s = ..., ReferenceCounter::decrement(static_cast<ReferenceCounter *>(cp1))) ,
        (__es.__s = 0, C::~C(&c1, 2, 0))) , @2;
}

Figure 29 Transformed Expressions With Extension 


Without the transformer extension, the transformed tree 
for the main function would be as shown in Figure 28. 
The first line of the transformed function contains a finite 
state machine table used for exception handling. Each of 
the 4 entries in the table specifies the action to take should 
an exception occur during execution of the function. There 
is an entry for the two local variables requiring 
destruction, along with the dynamically-allocated storage, 
in the order they occur in the function. Most of the 
remaining transformed function is fairly self-explanatory. 
The state variable __es.__s is updated to indicate the 
progress made, (in other words the state of the function), 
should an exception occur. The final line handles local 
destructors, updating the state as each destructor is called 
in reverse order of declaration. Note that there are no 
state table entries, state changes, or final destruction code, 
corresponding to the built-in pointers, as there is no 
cleanup necessary or possible for them. 


Figure 29 shows the transformed tree with the reference 
counter transformer extension. The first difference is that 
the state table contains entries for each of the three built-in 
pointers, calling the ReferenceCounter::dtor 
method. In addition, state changes have been added 
throughout the function for the built-in pointer 
declarations. Each declaration or pointer assignment now 
includes the additional smart pointer functionality that was 
added as part of the transformation extension. And finally, 
at the end of the function, the built-in pointers are 
decremented as they go out of scope, in reverse order of 
declaration. 
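
The assignment pattern visible in Figure 29 (decrement the old target, assign, then increment the new target) can be mimicked by hand. The following is only a rough C analogue under stated assumptions: struct counted, rc_increment, rc_decrement and rc_assign are hypothetical names, and setting a deleted flag stands in for actually destroying the object. It is not the ReferenceCounter class used by the extension.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counted object; `deleted` stands in for real destruction. */
struct counted {
    int rc;
    int deleted;
};

static void rc_increment(struct counted *p)
{
    if (p != NULL)
        p->rc++;
}

static void rc_decrement(struct counted *p)
{
    if (p != NULL && --p->rc == 0)
        p->deleted = 1;
}

/* Counted version of "*p = q", in the same order as Figure 29:
 * decrement the old target, assign, increment the new target. */
static void rc_assign(struct counted **p, struct counted *q)
{
    assert(p != NULL);
    rc_decrement(*p);
    *p = q;
    rc_increment(*p);
}
```

Note that with this ordering a self-assignment of the last reference would drop the count to zero before raising it again; a hand-written smart pointer would usually increment first to avoid that.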


7. Further Work 


There are several areas that were not covered by this 
work, most notably temporaries. The problem with 
temporaries is that the API currently does not support 
them very well. There is no model for detecting when a 
temporary goes out of use, which is necessary in order to 
correctly apply reference counting. The current model 
does not include temporaries in the reference count, which 
is sufficient for some, but not all, cases. Consider an 
expression such as cp2 = cp1++; The initial value of 
cp1 must be saved in a temporary prior to the increment 
of cp1 in order to be assigned to cp2, so the 
expression would be transformed as follows: 

    cp2 = (@0 = cp1 , cp1 = cp1 + 1, @0); 


Applying the smart pointer transformation without taking 
into account the temporary would yield an incorrect result 
if the decrement against cp1 caused the underlying 
storage to be deleted, resulting in a dangling reference 
being assigned to cp2. The API needs a mechanism for 
allowing a transformation to determine when a temporary 
goes out of use so that, for this example, the appropriate 
reference counting can take place. When such support for 
temporaries is available, the smart pointer implementation 
must also be updated to generate temporaries for the 
modified expressions to avoid multiple evaluation of 
expressions containing side-effects, as discussed earlier. 


Further work also needs to be done in the area of 
non-implicit scope termination, (for example through a 
return statement), and exception handling. The current 
transformation implementation handles reference 
decrementing only for implicit scope termination, through 
the state table additions. However, the API does support 
the capability to handle explicit scope termination, by 
detecting return statements and using the state information 
to determine what needs to be done. With respect to 
exception handling, the current extension implementation 
does work for simple examples, but not for all cases. Due 
to time constraints, we did not pursue these areas, but 
anticipate that the implementation would be fairly 
straightforward. 


8. Conclusion 


C++ smart pointers, while similar to built-in pointers, 
cannot be used interchangeably. Most notably, implicit 
compiler conversions are not supported for smart pointers. 
We have proposed that the “smarts” of smart pointers be 
added to built-in pointers, and presented the expression 
transformations that would be necessary to implement a 
reference-counting built-in pointer. Using the Montana 
API, we have demonstrated a working example of these 
ideas. 


The Montana API interface has proven to be quite 
complete for the purposes of adding transformation 
extensions. Other work in this area [Car97] supports this 
conclusion. We found the API interface and design to be 
reasonably straightforward and understandable, 
particularly given the complexity of the problem we were 
attempting to solve, that of modifying compiler-generated 
expressions. Nonetheless, adding a transformation 
extension is not a trivial undertaking, and is more likely to 
be expected of a class library vendor rather than a casual 
programmer. 


While there is some additional work necessary to allow 
full support for a reference counting smart pointer 
implementation, it is clear that the interface is quite 
capable of handling such language-level extensions. The 
API definitely needs better support for temporaries, both 
those generated by the compiler and by extensions such as 
the one we have demonstrated. However, given the 
flexibility of the interface, this does not seem like a 
difficult design issue. 
Abstract 


Toba is a system for generating efficient standalone Java 
applications. Toba includes a Java-bytecode-to-C com- 
piler, a garbage collector, a threads package, and Java 
API support. Toba-compiled Java applications execute 
1.5-4.2 times faster than interpreted and Just-In-Time 
compiled applications. 


1 Introduction 


Java [GYT96] is an object-oriented language designed 
by Sun Microsystems that supports mobile code, i.e., 
executable code that runs on a variety of platforms. 
Although the language is interesting in its own right, Java’s 
popularity stems from its promise of “write once, run 
anywhere.” Mobile code proponents envision a future of 
location-independent code moving about the Internet and 
running on any platform. 

Java’s mobility is achieved by compiling its object 
classes into a distribution format called a class file. A 
class file contains information about the Java class, in- 
cluding bytecodes, an architecturally-neutral representa- 
tion of the instructions associated with the class’s meth- 
ods. A class file can execute on any computer supporting 
the Java Virtual Machine (JVM). Java’s code mobility, 
therefore, depends on both architecture-neutral class files 
and the implicit assumption that the JVM is supported on 
every client machine. 

Most JVM implementations execute bytecodes via 
interpretation or Just-In-Time (JIT) compilation, which 
compiles the bytecodes into machine code at run time. 
Thus, Java’s mobility comes at a price, exacted by the 
cost of interpreting or JIT-compiling the bytecodes every 
time the program is executed. These systems incur modest 
to severe performance penalties compared to more 
traditional systems that compile source code directly to 
machine code once. For example, a compiled C program 
runs 1.5-2.2 times faster than the equivalent JIT-compiled 
Java program, and 2.6-4.2 times faster than an 
interpreted Java program. 


Address: Department of Computer Science, University of 
Arizona, Tucson, AZ 85721; Email: {todd, gmt, bridges, 
jhh, newsham, saw}@cs.arizona.edu. 

These performance penalties are especially bother- 
some in non-mobile applications that are run many times 
without change. To combat these inherent performance 
penalties we have developed a Java system that pre- 
compiles Java class files into machine code. Our system, 
Toba,¹ first translates Java class files into C code, then 
compiles the C into machine code. The resulting object 
files are linked with the Toba run-time system to create 
traditional executable files. To distinguish our technique 
from JIT compilation, we have (somewhat facetiously) 
coined the phrase Way-Ahead-of-Time (WAT) compiler 
to describe Toba. Toba compiles Java programs into ma- 
chine code during program development, eliminating the 
need for interpretation or JIT compilation of bytecodes. 
Although we forfeit Java’s architecture-neutral distribu- 
tion, Toba-generated executables are 1.5-4.4 times faster 
than alternative JVM implementations. 

Toba has several advantages over interpretation or 
JIT-compilation. First, because Toba runs way-ahead- 
of-time, rather than just-in-time, the resulting machine 
code can be more heavily optimized to yield more ef- 
ficient executables. Second, because Toba creates a C- 
equivalent to the Java program, the standard C debug- 
ging and profiling tools can operate on Toba-generated 
executables. Third, because Toba executables include all 
class files used by the application, there is no possibility 
of an application suddenly ceasing to execute because of 
a change in available class files. For these reasons we be- 
lieve that WAT-compilation is valuable for the develop- 
ment and distribution of efficient Java programs. 

Toba consists of many components: a bytecode-to-C 
translator, a garbage collector, a threads package, a 
run-time library, and native routines implementing the Java 
API. Toba is a surprisingly small system: the translator 
is only 5000 lines of Java; the garbage collector is a 
modestly-altered version of the Boehm-Demers-Weiser 
conservative collector [BW88]; the threads package is 
built on top of Solaris threads; the run-time library is only 
6500 lines of C; and the API routines are simply 
translations of Sun’s API class files. Except for dynamic 
linking, Toba provides a complete Java execution 
environment. 


¹ Lake Toba is a prominent feature on Sumatra, the island just west 
of Java. 


Conference on Object-Oriented Technologies and Systems - June 16-20, 1997 


2 The Java Virtual Machine 


The Java Virtual Machine (JVM) defines a stack-based 
virtual machine that executes Java class files [LY97]. 
Each Java class compiles into a separate class file con- 
taining information describing the class’s inheritance, 
fields, methods, etc., as well as nearly all of the compile- 
time type information. The Java bytecodes form the 
JVM’s instruction set, and combine simple arithmetic 
and control-flow operators with operators specific to the 
Java language’s object model. Powerful object-level in- 
structions include those to access static and instance vari- 
ables, and those to invoke static, virtual, nonvirtual and 
interface functions. The JVM also includes an exception 
mechanism for handling abnormal conditions that arise 
during execution. 

The JVM also provides facilities for managing ob- 
jects and concurrency. The JVM implements a garbage- 
collected object allocation model, with facilities for ini- 
tializing and finalizing objects. Concurrency is provided 
through a thread abstraction. Threads are pre-emptive 
and scheduled according to priority. A monitor facility 
provides mutual exclusion on critical sections as well as 
thread scheduling through wait/notify primitives. Moni- 
tors are recursive, allowing a single thread to acquire the 
same monitor lock multiple times without deadlocking. 


3 Toba’s Run-Time Data Structures 


Java’s rich object model requires run-time data struc- 
tures to describe each object’s type and methods. We de- 
veloped our data structures with both performance and 
simplicity in mind. They differ in many respects from 
those of Sun’s implementation of Java. For instance, 
Sun’s implementation requires that all object references 
go through a handle, which represents an extra level of 
indirection, an added inefficiency, and an extra compli- 
cation. Toba accesses objects directly. The differences 
are invisible to Java programmers but important to au- 
thors of native methods. 
3.1 Naming 


Toba attempts to preserve Java names in the C it 
produces, although this isn’t always possible. Java names 
may draw from thousands of different Unicode characters 
whereas C names are limited to just 63 ASCII characters 
(letters, digits, and the underscore). Furthermore, some 
legal Java names such as enum and setjmp have special 
meaning in C. When a Java name cannot be used directly 
as a C name, Toba discards non-C characters, adds a 
hash-code suffix, and additionally adds a prefix character 
if the resulting name begins with a digit or other illegal 
character. 

Java method names always require hash-code suffixes. 
Toba translates each Java method into a C function, 
and these functions share a global namespace. Because 
Java methods may be overloaded among and within 
classes, a hash-code suffix is added to distinguish the 
methods. The suffix encodes the class name, the method 
name, and the method signature. 
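
The naming rules above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the hash function (djb2), the five-digit hexadecimal suffix, the 'x' prefix character, and the names keep_c_chars and mangle_method are all hypothetical, since the text does not say which hash or prefix Toba actually uses.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical hash (djb2); Toba's real hash function is not specified here. */
static unsigned long name_hash(const char *s)
{
    unsigned long h = 5381;
    while (*s != '\0')
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Copy src to dst, keeping only characters legal in a C identifier. */
static void keep_c_chars(const char *src, char *dst, size_t dstsz)
{
    size_t n = 0;
    assert(dstsz > 0);
    for (; *src != '\0' && n + 1 < dstsz; src++)
        if (isalnum((unsigned char)*src) || *src == '_')
            dst[n++] = *src;
    dst[n] = '\0';
}

/* Build a C function name for a Java method. The suffix hashes the class
 * name, method name, and signature, so overloaded methods stay distinct. */
void mangle_method(const char *cls, const char *method, const char *sig,
                   char *out, size_t outsz)
{
    char clean[128];
    char key[384];
    const char *prefix;

    keep_c_chars(method, clean, sizeof clean);
    snprintf(key, sizeof key, "%s.%s%s", cls, method, sig);
    /* Add a prefix if nothing survived or the name starts with a digit. */
    prefix = (clean[0] == '\0' || isdigit((unsigned char)clean[0])) ? "x" : "";
    snprintf(out, outsz, "%s%s_%05lx", prefix, clean,
             name_hash(key) & 0xFFFFFUL);
}
```

Hashing the unsanitized key means that two Java names which collapse to the same C characters still receive different suffixes, which is the property the hash-code suffix is meant to provide.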


3.2 Data Layout 


Java includes eight primitive types: byte, short, int, long, 
boolean, char, float, and double. Each translates into a 
primitive C type. (Note that Java’s “char” type repre- 
sents a 16-bit Unicode value.) 

All other Java types are reference types that subclass 
the root class, java.lang.Object. All reference 
types are translated into a C pointer type. Each reference 
points to an object instance, and all instances of a par- 
ticular class contain a class-pointer to a common class 
structure. Java has two different kinds of objects: array 
objects and ordinary objects. The Toba structure for or- 
dinary objects appears in Figure 1. An ordinary object’s 
class descriptor includes the instance size and a flag that 
indicates it is not an array. The Toba structure for array 
objects appears in Figure 2. An array’s class descriptor 
includes the element size and its flag indicates that it rep- 
resents an array. Array instances contain both a length 
field and a vector of elements. 

Each per-class run-time structure has three parts: 
general information that is needed for all classes (e.g., 
superclass information), a method table that contains 
pointers to virtual functions, and a table of class variables. 
Figure 3 summarizes run-time class-level information 
common to all classes. 

The method table is simply a vector of function point- 
ers and unique method identifiers. The method identi- 
fiers are used when invoking interface functions, which 
must be found at run-time. The structure of the method 
table is typical of statically-bound object-oriented lan- 
guages like Oberon-2 [MW91] and C++ [Str86]. Method 
tables include inherited methods as well as functions de- 
fined by the class itself. 
Figure 1: Ordinary Object Structure 

Figure 2: Array Object Structure 


initialization flag   Determines if the class has been initialized 
other flags           Miscellaneous flags including the Array Bit 
class name            Pointer to instance of class java.lang.String 
class instance        Instance of class java.lang.Class 
superclasses          Pointer to vector of superclasses for checking subclass relationship 
interfaces            Pointer to vector of interfaces 
referenced classes    Pointer to vector of referenced classes 
array class           Pointer to array class of current class 
element class         Pointer to element class, if array class 
initializer           Pointer to class initializer function 
constructor           Pointer to default instance initializer function 
finalizer             Pointer to instance finalizer function 

Figure 3: Fields of Class Descriptors 

Figure 4: Class/Subclass Structures 

Figure 5: Array Class Descriptors 

Class variables exist on a per-class basis, not a 
per-instance basis. Toba-generated programs reference class 
variables as externals stored in the class structure. Fig- 
ure 4 shows the class/subclass relationship of class de- 
scriptors. 

Class descriptors for arrays require special handling. 
An array of class X (“X[]”) may be declared by any 
arbitrary class that imports X. Similarly, an array of arrays 
of arrays of X, X[][][], may be declared by any class 
that imports X. Descriptors for these array classes must 
be unique: all instances of X[][][] must share the same 
class descriptor. Therefore, these array class descriptors 
must be able to be built at run-time. (It is possible to build 
them at link-time, but we chose to avoid this complica- 
tion.) Figure 5 illustrates the simple relationship between 
the descriptor of a class and the descriptor of an array of 
that class. 
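
One way to obtain unique array class descriptors at run time is to create each one lazily and cache it in the element class's "array class" field (see Figure 3). The sketch below assumes exactly that; struct class_desc and array_class_of are hypothetical names, only a few descriptor fields are modeled, and no locking is shown, so it is not thread-safe as written.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced class descriptor. */
struct class_desc {
    const char *name;                  /* base element name, for illustration */
    int is_array;                      /* the "Array Bit" */
    struct class_desc *array_class;    /* descriptor of T[], or NULL if not built */
    struct class_desc *element_class;  /* element descriptor, if this is an array */
};

/* Return the unique descriptor for "array of elem", building it on first use. */
struct class_desc *array_class_of(struct class_desc *elem)
{
    assert(elem != NULL);
    if (elem->array_class == NULL) {
        struct class_desc *a = calloc(1, sizeof *a);
        if (a == NULL)
            return NULL;               /* out of memory */
        a->name = elem->name;
        a->is_array = 1;
        a->element_class = elem;
        elem->array_class = a;         /* cache: later requests share it */
    }
    return elem->array_class;
}
```

Applying array_class_of three times to the descriptor of X yields the descriptor for X[][][], and every class that asks for it receives the same pointer.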


3.3 Referencing Values and Methods 


Toba constructs efficient value and method references in 
C. Assume, for instance, that r is an instance of class 
rect. Table 1 summarizes the way Toba references ob- 
jects and methods in C. Toba-generated C accesses the 
instance variable width as r->width. A virtual func- 
tion call requires an indirection through the method ta- 
ble and requires passing the instance as the first argu- 
ment. Note that method names include hash suffixes. An 
interface call utilizes a table-lookup of the appropriate 
method based on its unique identifier (e.g., 298564082). 
Static methods and class variables do not require an in- 
stance variable. A static method invocation is a simple C 
function call. Class variables are accessed via the class’s 
run-time descriptor. 
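
The five kinds of reference described above can be imitated with ordinary C structures. The sketch below is a simplification under stated assumptions: the method table is an array rather than a structure with one named member per method, the class pointer is called cls (the generated code writes r->class), every method shares one signature, the hash suffixes are omitted, and the identifier 11111 for flip is made up. Only the identifier 298564082 comes from the text.

```c
#include <assert.h>
#include <stddef.h>

struct rect;
typedef int (*method_fn)(struct rect *);

struct method_entry {
    method_fn f;          /* pointer to the C function */
    unsigned long id;     /* unique method identifier */
};

enum { M_FLIP, M_CLEAR, M_COUNT };

struct rect_class {
    struct method_entry M[M_COUNT];   /* method table */
    struct { int nrects; } V;         /* class variables */
};

struct rect {
    struct rect_class *cls;           /* class pointer */
    int width;                        /* instance variable */
};

static int flip(struct rect *r)  { r->width = -r->width; return r->width; }
static int clear(struct rect *r) { r->width = 0; return 0; }

struct rect_class cl_rect = {
    { { flip, 11111UL }, { clear, 298564082UL } },
    { 0 }
};

/* Interface call: look the method up by identifier at run time. */
method_fn findinterface(struct rect *r, unsigned long id)
{
    int i;
    assert(r != NULL);
    for (i = 0; i < M_COUNT; i++)
        if (r->cls->M[i].id == id)
            return r->cls->M[i].f;
    return NULL;
}

/* Static method: a plain C function with no instance argument. */
int clearAll(void)
{
    cl_rect.V.nrects = 0;
    return 0;
}
```

With these definitions, r->cls->M[M_FLIP].f(r) plays the role of the virtual call, findinterface(r, 298564082UL)(r) the interface call, and cl_rect.V.nrects the class variable access.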


4 Code Translation 


Toba translates one class file at a time into a C file and 
a header file. To translate a class file, Toba requires the 
class files for all of the class’s superclasses. To compile 
a class’s resulting C file, header files are necessary from 
itself, its superclasses, and all imported classes. 


4.1 Code Translation 


Within class files, methods are encoded in the JVM’s 
byte-coded instruction set. Toba translates each method 
into a C function. Toba assumes that the class file is valid 
and verifiable, although it does nothing to confirm this 
assumption. 

The JVM instruction set is stack-based. During exe- 
cution, (verifiable) bytecode maintains a stack invariant 
that is critical for translation into efficient C (or native) 
code: regardless of the previous execution path, at any 
given point in the program, the stack is always in a 
consistent state (i.e., the same number and types of values are 
on the stack). For instance, if along one path to a given 
program point, P, the stack is empty just prior to execut- 
ing P, then along all paths the stack will be empty just 
prior to executing P. This invariant means that the depth 
of the stack and the types of its contents at any point in the 
program are fixed. A simple traversal of the bytecode can 
determine this information at compile time. Using this 
information, the Toba translator is able to turn all stack 
accesses into references to simple local variables—one 
per stack location. This eliminates the need for an ex- 
plicit stack or stack pointer. 

Most Java constructs translate simply into bytecode 
for this stack machine. For instance, the middle column 
of Figure 6 gives the bytecode for a=b+c; assuming 
that a, b, and c are the first, second and third local 
variables of the enclosing method. The iload and istore 
instructions refer to loads and stores of local variables. 
Toba creates a C local variable for each JVM local 
variable. 

Figure 6 gives a simple translation of the previous 
Java statement into C. In the example, i1 and i2 refer 
to the first and second elements of the stack, and iv1, 
iv2 and iv3 refer to the first three JVM local variables. 
Once the stack depths are known, Toba generates naive 
code. Toba relies on an optimizing C compiler to do copy 
propagation and register allocation to eliminate useless 
copies and local variables. 

Generating code for each method follows this outline: 


1. Read the bytecode instructions from the class file 


2. Compute the stack state at every instruction 


3. Note instructions that are exception range entry 
points and assign labels to them 


4. Note jump target instructions and assign labels to 
them 


5. Generate C function header 


6. Generate C code for each instruction 


Computing stack states requires visiting all instruc- 
tions. After computing stack state, Toba translates byte- 
code instructions one at a time. 
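
The stack-state computation can be illustrated with a small worklist traversal. The sketch below is an assumption-laden toy rather than Toba's translator: it uses a made-up six-opcode instruction set, tracks only the stack depth (the real analysis also tracks the types of the stack entries), and the names compute_depths, struct insn and MAX_INSNS are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_INSNS 64
#define UNVISITED (-1)

/* Toy opcodes standing in for JVM bytecodes. */
enum op { OP_LOAD, OP_ADD, OP_STORE, OP_IFEQ, OP_GOTO, OP_RETURN };

struct insn {
    enum op op;
    int target;     /* branch target index; used by OP_IFEQ and OP_GOTO */
};

static int pops(enum op op)
{
    switch (op) {
    case OP_ADD:   return 2;
    case OP_STORE: return 1;
    case OP_IFEQ:  return 1;
    default:       return 0;
    }
}

static int pushes(enum op op)
{
    return (op == OP_LOAD || op == OP_ADD) ? 1 : 0;
}

/* Record that control reaches instruction `to` with stack depth d.
 * Fails if `to` is out of range or was already reached with another depth. */
static int flow(int to, int d, int n, int *depth, int *work, int *top)
{
    if (to < 0 || to >= n)
        return -1;
    if (depth[to] == UNVISITED) {
        depth[to] = d;
        work[(*top)++] = to;    /* each index is pushed at most once */
        return 0;
    }
    return depth[to] == d ? 0 : -1;
}

/* Fill depth[i] with the stack depth before instruction i.
 * Returns 0 on success, -1 if the invariant is violated. */
int compute_depths(const struct insn *code, int n, int *depth)
{
    int work[MAX_INSNS];
    int top = 0, i;

    assert(code != NULL && depth != NULL);
    if (n <= 0 || n > MAX_INSNS)
        return -1;
    for (i = 0; i < n; i++)
        depth[i] = UNVISITED;
    depth[0] = 0;
    work[top++] = 0;

    while (top > 0) {
        int after;
        enum op op;
        i = work[--top];
        op = code[i].op;
        if (depth[i] < pops(op))
            return -1;                              /* stack underflow */
        after = depth[i] - pops(op) + pushes(op);
        if (op == OP_RETURN)
            continue;
        if ((op == OP_GOTO || op == OP_IFEQ) &&
            flow(code[i].target, after, n, depth, work, &top) != 0)
            return -1;
        if (op != OP_GOTO &&
            flow(i + 1, after, n, depth, work, &top) != 0)
            return -1;
    }
    return 0;
}
```

Once depth[i] is known for every instruction, a stack access at depth k can be rewritten as a reference to a fixed local variable for slot k, which is what makes the explicit stack unnecessary.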

Java Reference      Type                Toba-generated Reference 

r.width             instance variable   r->width 
r.flip()            virtual method      r->class->M.flip_r_b79qV(r) 
r.clear()           interface method    findinterface(r, 298564082)(r) 
rect.clearAll()     static method       clearAll_b7zk4() 
rect.nrects         class variable      cl_rect.V.nrects 

Table 1: Toba-generated References (Omitting C Casts) 


Java          Bytecode     Generated C 

a = b + c;    iload_2      i1 = iv2; 
              iload_3      i2 = iv3; 
              iadd         i1 = i1 + i2; 
              istore_1     iv1 = i1; 

Figure 6: Translating a = b + c; into C 


The Java bytecode supports both direct (conditional 
and unconditional) branches, as well as indirect jumps. 
Toba computes all potential targets of direct and 
indirect jumps, as well as exception handling blocks, in a 
control-flow analysis. (Verifiable bytecodes are 
guaranteed to be easy to analyze accurately.) Toba emits a 
C label before the executable code for each target 
instruction. To handle indirect jumps and exception 
handling, a giant switch statement wraps each method’s 
generated C code, with each indirect target having its 
own case arm. Thus, indirect jumps translate into C 
code that sets a program counter variable, jumps to the 
top of the switch, and then dispatches on that variable 
to the appropriate chunk of code. Unconditional direct 
jumps become goto’s; conditional direct jumps become 
if (...) goto Ln statements. As an optimization, 
Toba omits the switch wrapper in the absence of 
exception handling blocks and indirect jumps. 
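
The switch wrapper can be pictured with a small hand-written function. This is a sketch of the general shape only, not actual Toba output: add_twice, the labels TOP and L_sub, and the use of a saved return index to imitate a subroutine-style indirect jump are all assumptions made for illustration.

```c
#include <assert.h>

/* Hand-written imitation of a translated method that contains an
 * indirect jump. Each indirect target is a case arm; direct jumps
 * are plain gotos to labels inside the switch. */
int add_twice(int x)
{
    int pc = 0;     /* simulated program counter */
    int ret = 0;    /* index of the arm to resume at after the subroutine */
    int acc = 0;

    assert(pc == 0);
TOP:
    switch (pc) {
    case 0:                 /* method entry */
        ret = 1;
        goto L_sub;         /* direct jump */
    case 1:                 /* indirect target */
        ret = 2;
        goto L_sub;
    case 2:                 /* indirect target */
        return acc;
    L_sub:                  /* shared subroutine body */
        acc += x;
        pc = ret;           /* indirect jump: set pc ... */
        goto TOP;           /* ... and dispatch from the top of the switch */
    }
    return acc;             /* not reached */
}
```

Calling add_twice(5) runs the subroutine body twice, resuming at a different case arm each time, and returns 10.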

Figures 7 and 8 show a simple Java method along 
with its translation into bytecode and then into C. The 
naive code generation algorithm has produced several 
more assignments than would a human coder, but mod- 
ern C compilers are good at removing these. 


4.2 Exception Handling 


The Java Virtual Machine supports exception handling in 
a manner similar to Ada [Bar84] or C++ [Str86]. Excep- 
tions are thrown, either implicitly or explicitly, and are 
caught by the closest matching exception handler. Ex- 
ceptions that cannot be caught in a procedure require the 
JVM to unwind the call stack and re-throw the exception 
in the caller’s environment. Re-throwing continues until 
the exception is caught. 

Exception dispatching is based on the execution-time 
program counter of the JVM. Toba simulates the program 
counter by assigning values to a local pc variable. It is 
not necessary to set pc for every JVM instruction, but 
only when entering or leaving an exception range (taking 
into account that jumps can enter the middle of a range). 

Toba uses C’s setjmp and longjmp routines to 
control the call-stack unwinding. For each C function 
that may catch an exception, Toba creates a small 
prologue that calls setjmp to initialize a per-thread 
jmpbuf. The prologue saves the previous jmpbuf 
value in a local structure; epilogue code restores the old 
value before the function returns. Toba translates 
exception throwing into longjmp calls that use the jmpbuf. 
Such calls transfer control to the prologue of the nearest 
function that might handle the exception. This prologue 
code simply checks a table to determine if, given the 
type of the exception and the currently active program 
counter, this procedure can handle the exception. If so, 
the target label is set to the appropriate handler and 
execution transfers to the switch statement that dispatches 
indirect jumps. Otherwise, the prologue restores the 
previous jmpbuf, and immediately executes a longjmp 
with this jmpbuf. 
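
The prologue and epilogue protocol described above can be sketched in portable C. The example makes several simplifying assumptions: a single global pointer stands in for the per-thread jmpbuf, the handler table is reduced to one if test on the exception type and simulated pc, the handler simply returns -1, and all names (current_jmpbuf, throw_exception, safe_divide) are hypothetical.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

enum { EXC_NONE = 0, EXC_ARITHMETIC = 1, EXC_OTHER = 2 };

/* Stands in for the per-thread jmpbuf; this sketch is single-threaded. */
static jmp_buf *current_jmpbuf = NULL;
static int pending_exception = EXC_NONE;

/* Throwing becomes a longjmp to the nearest enclosing handler prologue. */
static void throw_exception(int type)
{
    assert(current_jmpbuf != NULL);   /* uncaught exceptions are not modeled */
    pending_exception = type;
    longjmp(*current_jmpbuf, 1);
}

static int divide(int a, int b)
{
    if (b == 0)
        throw_exception(EXC_ARITHMETIC);
    return a / b;
}

/* A function that may catch an exception. */
int safe_divide(int a, int b)
{
    jmp_buf here;
    jmp_buf *previous = current_jmpbuf;   /* prologue: save the old jmpbuf */
    volatile int pc = 0;                  /* simulated program counter */
    int result;

    if (setjmp(here) != 0) {
        /* Reached via longjmp: decide whether this function handles it. */
        current_jmpbuf = previous;
        if (pc == 1 && pending_exception == EXC_ARITHMETIC) {
            pending_exception = EXC_NONE;
            return -1;                    /* hypothetical handler body */
        }
        throw_exception(pending_exception);   /* re-throw to the caller */
    }
    current_jmpbuf = &here;
    pc = 1;                               /* entering the exception range */
    result = divide(a, b);
    pc = 0;                               /* leaving the exception range */
    current_jmpbuf = previous;            /* epilogue: restore the old jmpbuf */
    return result;
}
```

The pc variable is declared volatile because it is modified after setjmp and read after longjmp; without that qualifier its value would be indeterminate in the handler path.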


4.3 Class Initialization 


Each Java class may define an initialization routine to be 
run exactly once. Any of the following events can trigger 
initialization: 


• The first creation of an instance of a class. 


• The first invocation of any of a class’s static 
methods. 


• The first read or write of any class (not instance) 
variable. 


In the worst case, each of these operations includes 
checks to determine if the class initializer must be run. 
    class d {
        static int div(int a, int b) {
            a = a / b;
            return a;
        }
    }

    Method int div(int,int)
       0 iload_0
       1 iload_1
       2 idiv
       3 istore_0
       4 iload_0
       5 ireturn

Figure 7: Simple Java Program and Bytecode 


    Int div_ii_3wlen(Int p1, Int p2)
    {                                    /* int div(int, int) */
        Int i1, i2;                      /* integer stack */
        Int iv0, iv1;                    /* integer variables */
        iv0 = p1;                        /* init variables from params */
        iv1 = p2;
    L0: i1 = iv0;                        /* iload_0 */
        i2 = iv1;                        /* iload_1 */
        if (!i2)                         /* idiv */
            throwDivisionByZeroException();
        i1 = i1 / i2;
        iv0 = i1;                        /* istore_0 */
        i1 = iv0;                        /* iload_0 */
        return i1;                       /* ireturn */
    }

Figure 8: Sample Toba Output 


Calls to allocation routines check a per-class initializa- 
tion flag. Static methods include checks in their prologue 
code—no checking is done by the caller. Static-variable 
accesses include checks of the initialization flag. 

Often, these checks are not needed. Toba omits the 
checks for classes that have no initialization routine. 
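
A static method's prologue check might look roughly like the following. This is a minimal sketch, assuming a per-class flag and initializer pointer as in Figure 3; the names ensure_initialized, rect_clinit and rect_count are hypothetical, and the sketch ignores the locking a multithreaded run-time system would need.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced class descriptor. */
struct class_desc {
    int initialized;              /* initialization flag */
    void (*initializer)(void);    /* class initializer, or NULL if none */
};

static int nrects;                /* a class variable */
static int init_runs;             /* counts initializer runs, for illustration */

static void rect_clinit(void)
{
    nrects = 10;
    init_runs++;
}

static struct class_desc cl_rect = { 0, rect_clinit };

/* Run the class initializer the first time the class is used. */
static void ensure_initialized(struct class_desc *c)
{
    assert(c != NULL);
    if (!c->initialized) {
        c->initialized = 1;       /* set first so nested uses do not re-enter */
        if (c->initializer != NULL)
            c->initializer();
    }
}

/* Static method: the check lives in the callee's prologue, not the caller. */
int rect_count(void)
{
    ensure_initialized(&cl_rect);
    return nrects;
}
```

For a class whose initializer pointer is NULL the translator can leave the ensure_initialized call out entirely, which is the omission described above.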


5 Garbage Collection 


Toba’s garbage collector is based on the freely-available 
Boehm-Demers-Weiser (BDW) conservative garbage 
collector [BW88]. A conservative collector treats every 
register and word of allocated memory as a potential 
pointer and traces all memory reached from these 
pointers. Therefore, the BDW collector does not need type 
information for the memory it manages. This frees Toba 
and native routine developers from concerns about 
memory management. 

Our modifications to the BDW collector are rela- 
tively minor, affecting about 30 lines of code. First, the 
BDW collector is a mark-and-sweep collector that re- 
quires all threads to be stopped during collection. This 
proved to be expensive in Toba’s thread package (Solaris 
threads), so we optimized the “stop the world” function- 
ality for the single-threaded case. 

Second, the behavior of finalizers and cyclic data 
structures in the JVM is slightly different from that 
supported by the BDW collector. The Java language 
specification (pages 231-234, [GJS96]) allows object 
finalizers to make previously unreachable objects 
able again, thereby “resurrecting” the objects. Although 
the BDW collector supported finalization and resurrec- 
tion of objects, it did not collect cyclic data structures 
containing finalizable objects. We therefore made an- 
other minor modification to the BDW collector to add 
this functionality. 


6 Threads and Synchronization 


The JVM defines a priority-based, preemptive thread 
model that includes synchronization facilities. Toba im- 
plements Java threads using Solaris threads, and uses 
Solaris locks to protect internal critical sections. The 
biggest problem we encountered when implementing 
Java threads is that Java allows threads to both suspend 
each other and to cause other threads to receive an asyn- 
chronous exception, such as thread termination. Toba 
uses UNIX’s signal mechanism to handle these asyn- 
chronous events, causing the receiving thread to either 
suspend itself or throw an exception, as appropriate. The 
problem is that this may cause a thread to block (or even
die) in the middle of a critical section, leaving the critical
section locked. To eliminate this possibility Toba uses a
limited form of roll-forward [MDP96] to allow a thread 
interrupted by a signal to exit the critical section before 
handling the signal. Note that this problem also exists 
with critical sections in the Java code itself; the Java lit- 
erature does not offer much of a solution other than rec- 
ommending limited use of these asynchronous thread op- 
erations. 

Java threads synchronize via monitors. Each object 
and class has a monitor associated with it, and only one 
thread at a time may hold the lock associated with a 
monitor. Condition variables are also provided to allow 
thread scheduling; the standard wait, notify, and
broadcast operations are supported.

An unusual feature of Java monitors is that they are
recursive, i.e. the same thread may enter a monitor
recursively without deadlock. This implies that Toba cannot
implement Java monitors using lock and unlock primitives
directly; instead monitors are a more complex data
structure containing a lock, a reference count, and the
identity of the thread holding the lock. If a thread enters a
monitor whose lock it already holds, the reference count
is incremented. Similarly, when the monitor is exited
the reference count is decremented and the lock only
released when zero is reached. If a thread leaves the
monitor to wait on a condition, the lock is released and the
reference count cleared; when the thread subsequently
re-enters the monitor the lock is re-acquired, and the
reference count is restored.
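
A recursive monitor of this shape might be sketched as below. The sketch uses POSIX threads rather than the Solaris thread primitives Toba is built on, and the type `monitor_t`, the function names, and the extra `guard` mutex that protects the bookkeeping fields are all assumptions made for this illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t lock;    /* the underlying non-recursive lock */
    pthread_mutex_t guard;   /* assumption: protects owner and count */
    pthread_t owner;         /* meaningful only while count > 0 */
    int count;               /* nesting depth of the owning thread */
} monitor_t;

void monitor_init(monitor_t *m)
{
    pthread_mutex_init(&m->lock, NULL);
    pthread_mutex_init(&m->guard, NULL);
    m->count = 0;
}

void monitor_enter(monitor_t *m)
{
    pthread_t self = pthread_self();
    int mine;

    pthread_mutex_lock(&m->guard);
    mine = m->count > 0 && pthread_equal(m->owner, self);
    if (mine)
        m->count++;                      /* re-entry: just bump the count */
    pthread_mutex_unlock(&m->guard);
    if (mine)
        return;

    pthread_mutex_lock(&m->lock);        /* first entry: may block */
    pthread_mutex_lock(&m->guard);
    m->owner = self;
    m->count = 1;
    pthread_mutex_unlock(&m->guard);
}

void monitor_exit(monitor_t *m)
{
    int release;

    pthread_mutex_lock(&m->guard);
    assert(m->count > 0 && pthread_equal(m->owner, pthread_self()));
    release = (--m->count == 0);
    pthread_mutex_unlock(&m->guard);
    if (release)
        pthread_mutex_unlock(&m->lock);  /* released only when zero is reached */
}

/* Before waiting on a condition: give up the lock and remember the depth. */
int monitor_release_all(monitor_t *m)
{
    int saved;

    pthread_mutex_lock(&m->guard);
    assert(m->count > 0 && pthread_equal(m->owner, pthread_self()));
    saved = m->count;
    m->count = 0;
    pthread_mutex_unlock(&m->guard);
    pthread_mutex_unlock(&m->lock);
    return saved;
}

/* After waiting: take the lock again and restore the saved depth. */
void monitor_reacquire(monitor_t *m, int saved)
{
    pthread_mutex_lock(&m->lock);
    pthread_mutex_lock(&m->guard);
    m->owner = pthread_self();
    m->count = saved;
    pthread_mutex_unlock(&m->guard);
}
```

The condition variable itself is left out; `monitor_release_all` and `monitor_reacquire` only show how the reference count is cleared and later restored around a wait.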

To reduce synchronization overhead, Toba has an op- 
timized monitor implementation for single-threaded ap- 
plications. Entering and exiting monitors only affects 
their reference count; the monitor locks are not used. 
Should another thread be created, the original thread first 
locks all monitors that have a positive reference count, 
thus ensuring mutual exclusion now that there is more 
than one thread. 
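
The single-threaded fast path and the switch to real locking could look roughly like this. It is a sketch under stated assumptions: POSIX threads stand in for Solaris threads, and the linked registry `all_monitors`, the flag `multithreaded`, the global `state_guard`, and the function `become_multithreaded` are hypothetical; the text does not say how Toba finds the monitors it must lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct monitor {
    pthread_mutex_t lock;
    pthread_t owner;          /* meaningful only in multi-threaded mode */
    int count;                /* reference count */
    struct monitor *next;     /* hypothetical registry of all monitors */
} monitor_t;

monitor_t *all_monitors = NULL;
int multithreaded = 0;        /* set just before a second thread is created */
static pthread_mutex_t state_guard = PTHREAD_MUTEX_INITIALIZER;

void monitor_init(monitor_t *m)
{
    pthread_mutex_init(&m->lock, NULL);
    m->count = 0;
    m->next = all_monitors;
    all_monitors = m;
}

void monitor_enter(monitor_t *m)
{
    pthread_t self;
    int mine;

    if (!multithreaded) {     /* fast path: only the count changes */
        m->count++;
        return;
    }
    self = pthread_self();
    pthread_mutex_lock(&state_guard);
    mine = m->count > 0 && pthread_equal(m->owner, self);
    if (mine)
        m->count++;
    pthread_mutex_unlock(&state_guard);
    if (mine)
        return;
    pthread_mutex_lock(&m->lock);
    pthread_mutex_lock(&state_guard);
    m->owner = self;
    m->count = 1;
    pthread_mutex_unlock(&state_guard);
}

void monitor_exit(monitor_t *m)
{
    int release;

    assert(m->count > 0);
    if (!multithreaded) {     /* fast path: only the count changes */
        m->count--;
        return;
    }
    pthread_mutex_lock(&state_guard);
    release = (--m->count == 0);
    pthread_mutex_unlock(&state_guard);
    if (release)
        pthread_mutex_unlock(&m->lock);
}

/* Called by the original thread just before it creates a second thread. */
void become_multithreaded(void)
{
    pthread_t self = pthread_self();
    monitor_t *m;

    for (m = all_monitors; m != NULL; m = m->next) {
        if (m->count > 0) {
            pthread_mutex_lock(&m->lock);  /* uncontended: no other thread yet */
            m->owner = self;
        }
    }
    multithreaded = 1;
}
```

Because the flag is set before the second thread exists, the new thread never observes a monitor that is held by count alone.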


7 Performance Results 


7.1 Methodology 


We tested Toba’s performance using both application
benchmarks and micro-benchmarks. The application
benchmarks test the overall system performance,
while the micro-benchmarks isolate the performance of 
individual language features (e.g., exception handling, 
thread switching, etc.). 

We compared Toba’s performance to three other systems:
Sun’s interpreter (JDK version 1.0.2), Sun’s JIT
compiler system for Solaris, and the Guava JIT compiler
(version 1.0 beta 6), by Softway Pty, Ltd. We compared
against the Sun interpreter because it is the reference
implementation of Java, and against the Guava JIT compiler
and the Sun JIT compiler because they are the only
other compilation systems for SPARCs of which we are
aware. We ran benchmarks on a Sun SPARCStation-20
with 128 MB of memory and two Model 61 SuperSPARC
processors. C code was compiled using Sun’s
commercial C compiler with full optimization
(-xO4 -xcg92).

The Guava JIT compiler, the Sun JIT compiler, and the
Sun interpreter must all do more work at run time than
Toba to execute benchmarks. All three systems must
dynamically load each class file, and the JIT compilers must
compile each method before it can be run. The
micro-benchmark times do not include the time to load class
files, while application benchmarks do include this time.


7.2 Application Benchmarks 


Table 2 describes the application benchmarks. Figure 9
shows the execution times of the benchmarks on the
three other systems, normalized to the Toba time. Each data
point represents the average of ten runs of the benchmark.
The JIT system results include the time to compile
the benchmark. The Toba-generated benchmarks are
1.5-4.2 times faster than those same benchmarks running
under other systems. Toba-generated code runs 2.6-4.2
times faster than programs running under the JDK
interpreter, and 1.5-2.5 times faster than the JIT compilers.
This speedup results in a tangible improvement in
the time to complete the benchmark; the JavaLex benchmark,
for example, improved from 159 seconds on JDK
and 80 seconds on Guava to only 45 seconds on Toba.
The average execution times of the benchmarks, plus
standard deviations, are given in Figure 10.

Toba-generated code is faster than Sun’s interpreter 
because compiling class files removes the overhead of 
interpretation and of dynamic loading. Toba-generated 
code is faster than the JIT systems because Toba does not
incur code generation costs at run time, and, possibly,
because the C compiler optimizes code more aggressively
than do the JIT compilers. For stand-alone applications 
that do not rely on dynamic loading, Toba provides large 
performance benefits over other systems. 


7.3 Micro-benchmarks


Table 3 describes the micro-benchmarks used to iso- 
late the performance differences in the systems. These 
benchmarks are an expanded version of the UCSD Java 
Microbenchmarks [GP96]. 

Table 4 shows results of running the benchmarks 
on each system. For accurate timing, each micro- 
benchmark was iterated in a loop until the total execution 
time was at least 5 seconds. This varied between 100 and 
100,000,000 iterations, depending on the benchmark. 
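
A timing loop in this spirit can be sketched as follows. The sketch is not the benchmark harness used in the paper: the names are hypothetical, the tenfold growth of the iteration count is an assumption, and it measures CPU time with the standard C `clock()` function.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

typedef void (*bench_fn)(void);

typedef struct {
    long iterations;    /* iteration count of the final, long-enough run */
    double seconds;     /* CPU time of that run */
} bench_result;

/* Time n calls of fn. */
static double time_run(bench_fn fn, long n)
{
    clock_t start = clock();
    long i;

    for (i = 0; i < n; i++)
        fn();
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Grow the iteration count until a single run lasts at least min_seconds. */
bench_result run_benchmark(bench_fn fn, long start_iters, double min_seconds)
{
    bench_result r;

    assert(fn != NULL && start_iters > 0 && min_seconds > 0.0);
    r.iterations = start_iters;
    for (;;) {
        r.seconds = time_run(fn, r.iterations);
        if (r.seconds >= min_seconds)
            return r;
        r.iterations *= 10;   /* assumption: tenfold growth per attempt */
    }
}

/* Example workload: one integer addition, kept alive through a volatile. */
static volatile int sink;
void add_int(void) { sink = sink + 1; }
```

Dividing `seconds` by `iterations` gives the per-iteration cost; a call such as `run_benchmark(add_int, 100, 5.0)` would mirror the 5-second threshold mentioned above.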

The results show that Toba outperforms the other sys- 
tems on almost all benchmarks. For example, Toba is 
12-29 times faster than JDK on the arithmetic and class- 
access benchmarks; this is directly attributable to JDK’s 
use of an interpreter, as Guava and the Sun JIT are nearly
as fast as Toba on these benchmarks. 

Toba is also usually 0.9-14 times as fast as the other 
systems at handling exceptions. This is because Toba 
does not explicitly unwind the stack when an excep- 
tion is thrown. Instead, Toba implements exception
handling via goto or setjmp/longjmp, depending on
whether the handler is within the same method or not. 
This makes exception handling in Toba extremely fast. 
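
The cross-method case can be illustrated with a small setjmp/longjmp sketch. It is deliberately simplified: the globals `current_handler` and `pending_exception` stand in for per-thread state, all names are hypothetical, and it is not the code Toba generates. When the handler is in the same method as the throw, a plain goto to the handler label suffices and none of this machinery is needed.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical per-thread state: innermost active handler and pending exception. */
static jmp_buf *current_handler = NULL;
static int pending_exception = 0;

enum { DIVISION_BY_ZERO = 1 };

/* Throw: jump straight to the innermost handler, with no frame-by-frame unwinding. */
static void throw_exception(int code)
{
    assert(current_handler != NULL);   /* the sketch requires an active handler */
    pending_exception = code;
    longjmp(*current_handler, 1);
}

static int divide(int a, int b)
{
    if (b == 0)
        throw_exception(DIVISION_BY_ZERO);
    return a / b;
}

/* Roughly: try { return divide(a, b); } catch (division by zero) { return fallback; } */
int safe_divide(int a, int b, int fallback)
{
    jmp_buf newbuf;
    jmp_buf *oldbuf = current_handler;        /* save the previous handler */
    volatile int result = fallback;

    if (setjmp(newbuf) == 0) {
        current_handler = &newbuf;            /* register this frame's handler */
        result = divide(a, b);
    } else if (pending_exception != DIVISION_BY_ZERO) {
        current_handler = oldbuf;
        throw_exception(pending_exception);   /* not ours: pass it upward */
    }
    current_handler = oldbuf;                 /* restore on every exit path */
    return result;
}
```

A throw costs one `longjmp` regardless of how many frames lie between the throw and the handler, which is consistent with the speed the text reports for exception handling.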

Synchronization is also fast in Toba, particularly in
single-threaded programs because Toba optimizes monitor
accesses in this situation. Although single-threaded
programs need no synchronization, they may still make
use of library classes that use synchronization.


Benchmark | Description | Input
JavaLex   | Lexical analyzer generator that translates regular expressions into finite-state machines that are subsequently translated into Java | Specification that includes 77 patterns
JavaCUP   | LALR(1) parser generator that translates context-free grammars into push-down automata that are subsequently translated into Java | Grammar that includes 24 terminals, 32 nonterminals, and 65 productions
javac     | Sun’s Java compiler that translates Java source programs into | Toba source files consisting
espresso  | Translates Java source programs into class files (bytecode) | Toba source files consisting
Toba      | Bytecode-to-C translator described in this paper | Toba’s 18 class files

Table 2: Application Benchmarks


[Bar chart over the benchmarks JavaLex, JavaCUP, javac, espresso, and Toba; series include Toba, JDK 1.0.2, and Sun JIT.]

Figure 9: Normalized Application Timings


Benchmark | Toba (sec.) | JDK (sec.)  | JDK/Toba | Sun JIT (sec.) | Sun JIT/Toba | Guava (sec.) | Guava/Toba
JavaLex   | 44.7 ± 0.3  | 198.9 ± 2.4 |          |                |              | 80.0 ± 1.1   |
JavaCUP   | 2.1 ± 0.05  | 5.4 ± 0.06  | 2.6      | 3.4 ± 0.02     | 1.6          | 5.3 ± 0.04   | 2.5
javac     | 10.7 ± 0.3  | 34.8 ± 0.3  | 3.3      | 20.4 ± 0.1     | 1.9          | 19.4 ± 0.2   | 1.8
espresso  | 5.3 ± 0.2   | 22.3 ± 0.3  | 4.2      | 11.9 ± 0.07    | 2.2          |              |
Toba      | 19.3 ± 0.1  | 56.6 ± 0.5  | 2.9      | 28.7 ± 0.17    | 1.5          | 39.6 ± 0.4   | 2.1

Figure 10: Application Benchmark Timings


Arithmetic Benchmarks
add-int            | Add two integers
multiply-int       | Multiply two integers
add-double         | Add two double-precision floating point numbers
multiply-double    | Multiply two double-precision floating point numbers

Class Access Benchmarks
instance-var       | Read an integer instance variable
method-local       | Invoke a method defined in the current (this) object
method-remote      | Invoke a method defined in a different object
method-interface   | Invoke an interface method

Exception Handling Benchmarks
exception-caller   | Throw an exception caught by method’s caller
exception-remote   | Throw an exception caught by a method ten levels up the call chain
exception-bypass   | Throw and catch an exception past an exception handler that does not catch the thrown exception

Synchronization Benchmarks
sync-block-single  | Enter a synchronized block in a single-threaded program
sync-method-single | Call a synchronized method in a single-threaded program
sync-block-multi   | Enter a synchronized block in a multi-threaded program
sync-method-multi  | Call a synchronized method in a multi-threaded program

Miscellaneous Benchmarks
null-loop          | Once around an empty loop
array-assign       | Assign to an element of an integer array
thread-yield       | Perform yields in 3 separate threads

Table 3: Micro-Benchmarks


Toba performs slightly worse than 
Guava on the interface-method invocation benchmark, 
the integer multiplication benchmark, the instance vari- 
able benchmark, and the array assignment benchmark. 
Toba also performs slightly worse than the Sun JIT com- 
piler on thread yields, since the Sun JIT system does not 
implement kernel threads or true concurrency. Toba per- 
formed better than any of the other systems on all other 
programs, large and small. 


7.4 Code Size 


Toba emits naive C code and relies on an optimizing C 
compiler to do register allocation, copy propagation, and 
branch elimination to produce efficient code. Table 5 
indicates the sizes of the benchmark programs in bytes
of class file, lines of C, and bytes of object code. Object
code sizes do not include the Toba run-time system,
which is a dynamic shared library. This library contains 
915,000 bytes of code. 


8 Project Status 


The Toba system currently runs under Solaris on SPARC 
workstations. The system includes all of the Java API 
except for dynamic linking and the graphics and applet 
libraries. Table 6 summarizes the sizes and implementa- 
tion languages of its various components. 

We intend to port Toba to additional architectures 
and operating systems. Porting Toba will require thread- 
specific changes to the run-time system and garbage col- 
lector. It will also require OS-specific changes to the run- 
time system. The bytecode translator and header files 
will change only minimally. 

Toba is the first piece of the larger “Sumatra” project.
The Sumatra project is exploring many aspects surrounding
the efficient execution of mobile code, with emphasis
on efficient implementations of the Java Virtual Machine.
We developed Toba to bootstrap our development
of the JVM API, threads, and garbage collector, as well
as to have fast Java applications.


Table 4: Micro-Benchmark Timings


Benchmark | Class file (bytes) | Emitted C code (lines) | Object file (bytes)
JavaLex   |                    |                        | 231,816
JavaCUP   | 119,094            | 50,297                 | 446,816
javac     | 508,916            | 127,678                | 869,756
espresso  | 295,281            | 83,098                 |
          |                    | 23,570                 |

Table 5: Program Sizes


Component                        | Implementation Language | Size (Lines)
Bytecode Translator              |                         | 4723
Run-time Support                 |                         | 4130
API Native Routines              |                         | 2809
Toba-specific Garbage Collection |                         |

Table 6: Implementation Details


9 Related Work 


Java is a relatively new programming language and virtual
machine. We know of no published results describing
implementation and performance characteristics.
Popular-press reports and commercial advertisements
indicate that many development efforts for Just-In-Time
(JIT) compilers are underway or have recently
completed, but the available information is sketchy.

Compiling higher-level languages to C is not new. 
Many language systems leverage existing compilers and 
use C as an intermediate language in the compilation 
process. Systems for Smalltalk [Git94], SR [And82], 
Scheme [Bar89], Icon [Wal91], Forth [EM96], SML 
[TAL90], Pascal [Gil90], Cedar [ADH+89], and Fortran
[FGMS90] are well known. For traditionally compiled
languages like Pascal and Fortran, translation to C im- 
proved portability. For Scheme, Forth, and Icon, trans- 
lation removed interpretation overhead. Similarly, Toba 
removes interpretation overhead from Java programs. 

Several other projects for compiling Java bytecodes
to C are currently underway. j2c [And96] is a restricted
bytecode to C compiler, currently ported to several
platforms. j2c (version 1 beta 5) does not support threads,
monitors, or network resources. In addition, native
routines cannot throw exceptions in j2c. Toba does not
have these restrictions.

Vortex [DDG+96] is another project that compiles
Java bytecodes to C. Vortex provides front ends for
C++, Cecil, Modula-3, and Java. These languages are
compiled to a common internal representation, and C
code is generated from this representation. The Vortex
project studies the effectiveness of optimizations for
object-oriented languages. The Vortex project reports
that Java programs speed up by as much as a factor of 8
as a result of these aggressive optimizations. Toba does
not currently perform any of these optimizations. Vortex
does not support threads, which has a global impact
on performance. No published information is available
about other details of Java run-time system support from
Vortex.

Jolt [Sir96] also compiles Java bytecodes to C.
Jolt generates a C function for some methods in a
class file, and then generates a new class file with these
methods marked as native. Method overloading is not
supported, and Jolt cannot compile class initialization
methods. Jolt produces class files that are used by the
standard Java interpreter. Toba produces stand-alone
executables.


10 Availability 


The Toba system is freely available via anonymous ftp.
All distribution information is described on the World
Wide Web at


http://www.cs.arizona.edu/sumatra/toba/.


A A Larger Example


Figure 11 expands on the example shown earlier by 
adding exception handling. An implicit branch (from the 
try block to the return) has also been added. 

Figure 12 gives Toba’s translation into C code. Ex- 
ception handling has enlarged the code significantly, and 
the effect is especially noticeable because the original ex- 
ample was so small. Besides the boilerplate code that 
is the same for all exception-catching methods, there are 
also assignments to pc that maintain the JVM program 
counter and case labels used for dispatching a caught 
exception. 


class d {
    static int div(int i, int j) {
        try {
            i = i / j;
        } catch (ArithmeticException e) {
            i = j;
        }
        return i;
    }
}

Method int div(int,int)
   0 iload_0
   1 iload_1
   2 idiv
   3 istore_0
   4 goto 10
   7 pop
   8 iload_1
   9 istore_0
  10 iload_0
  11 ireturn
Exception table:
   from   to   target   type
     0     4     7      <Class java.lang.ArithmeticException>

Figure 11: Sample Java Program and Bytecode


Int div_ii_3WIeN(Int p1, Int p2)
{                                                     int div(int, int)
static struct handler htable[] = {                    exception handler list
    &cl_java_lang_ArithmeticException.C,              goto L1 if 0 <= pc < 4

};
struct mythread *tdata;                               thread data pointer
jmp_buf newbuf;                                       jump buffer
void *oldbuf;                                         pointer to previous buffer
volatile int pc;                                      JVM program counter
int tgt;                                              jump target
int rv;                                               return value
Object a0, a1, a2;                                    reference stack
Int i0, i1, i2;                                       integer stack
volatile Int iv0, iv1;                                integer variables
    iv0 = p1;                                         initialize variables from parameters
    iv1 = p2;
    tdata = mythread();                               set thread data pointer
    oldbuf = tdata->jmpbuf;                           save old jmpbuf pointer
    tgt = 0;                                          dispatch first to entry point
    if (setjmp(newbuf)) {                             set up jump buffer
        sthread_got_exception();                      exception was caught:
CATCH:  a1 = tdata->exception;                        load exception value
        if ((tgt = findhandler(htable, a1, pc)) < 0)  find handler
            longjmp(oldbuf, 1);                       no handler; pass upward
    }
    tdata->jmpbuf = newbuf;                           register jump buffer for thread
TOP: switch(tgt) {                                    dispatch entry, ret, or exception
L0: case 0:
    pc = 0;                                           set pc for exception handling
    i1 = iv0;                                         iload_0
    i2 = iv1;                                         iload_1
    if (!i2)                                          idiv
        throwDivisionByZeroException();
    i1 = i1 / i2;
    iv0 = i1;                                         istore_0
    pc = 4;                                           reset pc on leaving exception range
    goto L2;                                          goto 10
L1: case 1:
    pc = 7;                                           reset pc after catching exception
    i1 = iv1;                                         iload_1
    iv0 = i1;                                         istore_0
L2: case 2:
    i1 = iv0;                                         iload_0
    rv = i1;                                          ireturn
    goto RETURN;
    }
RETURN:
    tdata->jmpbuf = oldbuf;                           restore previous jump buffer
    return rv;                                        return result
}


Figure 12: Sample Toba Output


Abstract


This paper discusses a realization of object persistence 
in a CORBA-based distributed system. In our approach, 
persistence of CORBA objects is accomplished by the in- 
tegration of the ORB with an ODBMS. This approach is 
not limited to pure object-oriented database systems, as 
the ODBMS may be a combination of a relational DBMS 
and an object-relational mapper. The design and im- 
plementation of an Object Database Adapter that inte- 
grates an ORB and an ODBMS with C++ bindings is 
presented. The ODA uses delegation (rather than inher- 
itance) to connect user-provided implementation classes 
and IDL-generated classes. Only the user-defined parts 
of CORBA objects are actually stored in a database. 
Their IDL-generated parts are dynamically instantiated, 
in transient memory, by the ODA. Persistent relationships
between CORBA objects within a server are not realized
at the CORBA level, but at the level of implementation
objects. Database traversals and queries can therefore
be executed at ODBMS speeds. The paper discusses
in some detail a number of implementation issues, such
as caching. ODA support for local transactions, ODA
interfaces, and CORBA server organization are also
examined.


1 Introduction 


In spite of its remarkable successes in promoting standards
for distributed object systems [14], the Object
Management Group (OMG) has not yet settled the issue
of object persistence in the Object Request Broker
(ORB) environment. The Common Object Request Broker
Architecture (CORBA) specification [7] briefly mentions
an Object-Oriented Database Adapter that makes
objects stored in an object-oriented database accessible
through the ORB. This idea is pursued in Appendix B
of the ODMG standard [2], which identifies a number
of issues involved in using an Object Database Management
System (ODBMS) in a CORBA environment, and
proposes an Object Database Adapter (ODA) to realize
the integration of the ORB with the ODBMS.

part of the Sunrise Project.

†Partly supported by a fellowship from the National Research

Possibly because this proposal was perceived by many 
as biased towards object-oriented databases, and hence 
distant from the mainstream database world, no further 
OMG specifications have contemplated the ODA ap- 
proach. Instead, a Persistence Object Service (POS), 
designed to accommodate the widest possible variety of 
data stores, was introduced in [8]. So far, POS has failed to
deliver on its promise. In response to this fact, the OMG
recently issued a request for proposals for POS version 
2.0: 


“While the industry posses many products 
from OMG members that could be considered 
to be in this space, it is clear that virtually 
none have compliant POS implementations in 
their product roadmaps. Most have taken the 
route of point integrations with ORB
products.” ([11], page 20)


Meanwhile, recognition that the ODA approach is 
not exclusive to object-oriented databases seems to have 
grown in the industry. Object-relational mappers — 
systems that map C++ classes/objects into relational ta- 
bles/tuples — have been employed to make relational 
databases appear as object-oriented ones. Because such 
mappers implement an ODBMS interface on top of a re- 
lational system, they extend to relational databases the 
applicability of the ODA approach. 

The benefits of integrating ORBs and ODBMSs in- 
clude: 


Database Heterogeneity. ORB/ODBMS integration allows
the construction of distributed object databases
that can be heterogeneous even with respect to
the DBMS software running on the database server
nodes.


“IDL views”. Access to database objects through IDL
interfaces does not require knowledge of the data- 
base schema: changes in the schema are transpar- 
ent to IDL clients. Interfaces can be defined to ex- 
pose only data items that certain users are permit- 
ted to read or update. IDL interfaces to database 
objects can therefore play a role analogous to re- 
lational views, both for data independence and for 
authorization purposes. 


Language Heterogeneity. Databases can be accessed 
by CORBA clients written in any language for 
which a mapping from IDL is defined. 


Security. The ORB's remote method invocation mecha- 
nism requires much less trust in the client than the 
data-shipping approach employed by pure object- 
oriented DBMSs. 


This paper discusses the design and implementation 
of an ODA that integrates an ORB and an ODBMS with 
C++ bindings. For our purposes, an ODBMS is a system
with programming interfaces similar to the ones speci- 
fied in [2]: it may be a pure object-oriented DBMS (an 
OODBMS), or a combination of a relational DBMS and 
an object-relational mapper. 

An ODA based on the ideas presented here was developed
as part of the Sunrise Project¹ at the Los Alamos
National Laboratory (LANL). This adapter has been used 
by the TeleMed system [4] since mid 1995, and is cur- 
rently employed by other LANL projects as well. We 
have implemented it for two ORBs, Orbix and VisiBro- 
ker for C++, with ObjectStore as the underlying ODBMS 
in both cases. Even though these implementations were 
aimed at a non ODMG-compliant ODBMS, we report 
our experience in ODMG terms whenever possible. 


1.1 The Case for an ODA


ODBMSs integrate database capabilities with an object- 
oriented programming language. They implement per- 
sistent memory, a single-level store abstraction of the 
memory hierarchy. An ODBMS with C++ bindings pro- 
vides a persistent address space for C++ objects, with 
heap-style allocation/deallocation. ODBMS program- 
mers manipulate persistent C++ objects in the same way 
they manipulate objects in the transient heap. 

¹ See http://www.acl.lanl.gov/sunrise/sunrise.html.

Nevertheless, a CORBA server implemented in C++
cannot simply place in persistent memory the objects
it implements. To have the status of a CORBA object,
a C++ object must be registered with the ORB, which
keeps a per-server-process table of active objects. The
details of how C++ objects are registered as CORBA
objects are not fully specified by the current release of
CORBA.² In existing ORBs, CORBA objects are registered
upon creation. The following approaches are currently
used by ORB implementations:


1. A server may create CORBA objects only via calls
to the ORB, usually to the function BOA::create.


2. A server can instantiate CORBA objects directly.
The constructor of a CORBA object executes
IDL-generated code that registers the object with the
ORB.


On the other hand, the ODBMS provides its own overloaded
form of operator new. It requires persistent objects
to be created by this operator. If the ORB enforces
approach 1 above, then there is clearly no way of placing
a CORBA object in persistent memory. If the ORB supports
approach 2, one could naively use the overloaded
form of operator new to instantiate “persistent CORBA
objects”. This would not work, because the constructor
of a persistent object is invoked only when the object
is added to the database: “persistent CORBA objects”
stored by other processes (including previous runs of
the same server program) would not be registered with
the ORB. As far as the ORB is concerned, these objects
would not be active: no requests would be delivered to
them.³

To make the ORB and the ODBMS work together, an 
additional component is necessary. Driven by incoming 
requests, such a component should activate objects that 
lie dormant in persistent memory. To allow on-demand 
activation of dormant objects, it must ensure that object 
references handed out to CORBA clients contain infor- 
mation on the location of the corresponding objects in 
persistent memory. Hence this component has to be re- 
sponsible for the generation and interpretation of refer- 
ences to persistent objects. In the OMG ORB architec- 
ture these responsibilities belong to an Object Adapter. 
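
The activation scheme outlined above can be sketched in a few lines. The sketch is written in C to stay close to the other examples and is purely illustrative: the toy `store` array stands in for persistent memory, and `object_ref`, `active_object`, `activate`, and `deposit` are hypothetical names, not part of any ORB, ODBMS, or of the ODA described in this paper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define STORE_SIZE 8

/* Toy stand-in for persistent memory: implementation objects addressed by slot. */
typedef struct { int balance; } account_impl;
account_impl store[STORE_SIZE];

/* An object reference carries the object's location in persistent memory. */
typedef struct { int slot; } object_ref;

/* Transient, per-process record of an active object. */
typedef struct { account_impl *impl; int requests; } active_object;
active_object *active[STORE_SIZE];          /* NULL means dormant */

/* Interpret a reference; activate the object on demand if it is dormant. */
static active_object *activate(object_ref ref)
{
    assert(ref.slot >= 0 && ref.slot < STORE_SIZE);
    if (active[ref.slot] == NULL) {
        active_object *a = malloc(sizeof *a);
        assert(a != NULL);
        a->impl = &store[ref.slot];         /* locate the persistent part */
        a->requests = 0;
        active[ref.slot] = a;
    }
    return active[ref.slot];
}

/* An incoming request drives activation. */
int deposit(object_ref ref, int amount)
{
    active_object *a = activate(ref);
    a->requests++;
    a->impl->balance += amount;
    return a->impl->balance;
}

/* Deactivation discards only the transient record. */
void deactivate(object_ref ref)
{
    assert(ref.slot >= 0 && ref.slot < STORE_SIZE);
    free(active[ref.slot]);
    active[ref.slot] = NULL;
}
```

Because the reference itself names the persistent location, a request that arrives after deactivation (or after a server restart, in the real setting) can still find and reactivate its target.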


1.2 The Role of the ODA


The primary role of the ODA is to provide CORBA
servers with an application-independent way of making
CORBA objects persistent. This includes ensuring that
references to persistent objects are themselves persistent.
In CORBA, persistence of object references means that
“a client that has an object reference can use it at any
time without warning, even if the (object) implementation
has been deactivated or the (server) system has been
restarted” [7].


² The underspecification of a number of server-side issues led to
server portability problems [9], which the OMG is about to solve [1].

³ CORBA distinguishes object activation (activation of individual
objects within a server) from implementation activation (server
activation, usually performed by ORB-provided daemons).

With persistence of object references, it makes perfect
sense for a client to store an object reference for later use.
References to persistent CORBA objects implemented
by server X can be stored by server Y (a client of server
X), thereby enabling the construction of ORB-connected
multidatabases. In such a multidatabase, references to
remote objects are used to express relationships between
CORBA objects implemented by different servers.


Distributed transactions, in an ORB-connected multi- 
database, should be supported by a TP monitor that im- 
plements the Object Transaction Service (OTS) specified 
by the OMG [8]. In the absence of this service, the ODA 
has the additional role of ensuring that operations on
persistent objects are encompassed by local transactions.⁴ It
interacts with the ODBMS to start and commit (or abort) 
database transactions. 


1.3 Organization of this Paper 


The next section motivates and presents the general de- 
sign of the ODA. Section 3 discusses implementation is- 
sues; Section 4 considers transactions; Section 5 exam- 
ines the ODA interfaces and their typical usage; Section 
6 mentions related work; and Section 7 presents conclud- 
ing remarks. 


2 Design Decisions 


Our perspective is the one of a third-party implementor,
with no access to ORB and ODBMS internal interfaces.
Accordingly, our ODA is an add-on to the ORB's
native Object Adapter (OA), rather than a replacement 
for it. Figure 1 shows how it fits into the integrated 
ORB/ODBMS environment. 


Note that the ODBMS is depicted as a separate en- 
tity holding persistent objects. This representation ex- 
poses the three-tiered nature of the ORB/ODBMS envi- 
ronment: an object implementation — the middle tier — 
is at the same time a client of the ODBMS and a server 
to CORBA clients. For simplicity, in a subsequent fig- 
ure we omit the ODBMS and represent persistent objects 
within the CORBA server. The reader should keep in 
mind that “persistence within an object implementation” 
is a simplified representation of the architecture in
Figure 1.


⁴ Several OODBMSs, including ObjectStore, do not yet support the
resource manager interface required by OTS. This service might also be
absent simply because a particular application does not need distributed
transactions.


2.1 What to Place in Persistent Memory 


A CORBA object has two parts: an IDL skeleton and
a user-defined part.⁵ The skeleton consists of ORB-specific
data members and member functions, all of them
mechanically generated from an IDL specification. It is
an instance of a skeleton class, a server-side dispatcher
generated by the IDL translator. The user-defined part is
the implementation object, an instance of an implementation
class provided by the server writer. The implementation
object encompasses the data members and member
functions actually defined by the object implementor.⁶

The data members in the user-defined part of a 
CORBA object are relevant to the application, the ones 
in the skeleton part are relevant to the ORB only. If we 
employ an ODBMS to make CORBA objects persistent, 
we should certainly keep their implementation objects in 
persistent memory. Should we also place their skeleton 
parts in persistent memory? An obvious reason for not
doing so is waste of database space, especially in the case
of fine-grained objects.⁷ Stronger reasons are:


ORB independence. Keeping ORB-specific data mem- 
bers in persistent memory ties the database to a 
particular ORB implementation. As ORB products 
evolve, these data members may change with ORB 
releases. Databases with ORB-specific information 
would then have to go through a schema evolution 
process. 


Performance. Assuming that CORBA objects are
reference counted,⁸ the skeleton part of a CORBA
object holds its reference count, which is updated 
by the primitives duplicate and release. Plac- 
ing reference counts in persistent memory means 
encompassing these primitives by update transac- 
tions. Every operation that receives or returns a ref- 
erence to a persistent object would then require an 
update transaction, because parameter passing in- 
volves duplicate and release calls. 


Only the user-defined parts of CORBA objects should 
be placed in persistent memory. As the ODA activates 


⁵ We are not considering the case of CORBA objects implemented
with the Dynamic Skeleton Interface.

⁶ Terminology could be better here, as implementation object is easily
confused with object implementation. The former is an instance
of an implementation class; the latter is the OMG term for a CORBA
server. We prefer the vocabulary adopted by [5] (servant for
implementation object, servant class for implementation class) but
refrain from using it, because the new Portable Object Adapter
specification [1] has assigned another meaning to the word servant.

⁷ Besides ORB-specific data members, the skeleton part of a
CORBA object typically has a pair of hidden vbase and vtable pointers
for each interface class in the object's inheritance chain up to
CORBA::Object.

⁸ Although CORBA does not specify such implementation details,
most (if not all) ORB implementations keep a reference count per
object.

Figure 1: The Object Database Adapter.


and deactivates objects, it should dynamically instanti- 
ate and release their skeleton parts, allocated in transient 
memory. These observations lead us to a clear choice 
with respect to the relationship between skeletons and 
implementation objects. 


2.2 Delegation, Not Inheritance 


Figure 2 shows the alternatives commonly used to connect
the parts of a CORBA object. In the inheritance
approach, the object implementor derives implementation
classes from IDL-generated skeleton classes. In the
delegation approach, also known as the tie approach,
instances of IDL-generated skeleton classes are called tie
objects, or simply ties. Each tie holds a reference to
an implementation object to which it delegates operations.
While inheritance imposes identical lifetimes on
both parts of a CORBA object, delegation allows implementation
objects to outlive their skeleton objects. We
therefore choose delegation as the interface implementation
approach supported by the ODA.
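
The lifetime argument above can be made concrete with a minimal C++ sketch. It is an illustration only: the Account interface, the AccountSkeleton class and the AccountTie template are hypothetical names, and the skeleton is a plain abstract class standing in for what an IDL translator would generate. No ORB is involved.

```cpp
#include <cassert>

// Stand-in for an IDL-generated skeleton class (hypothetical interface).
class AccountSkeleton {
public:
    virtual ~AccountSkeleton() = default;
    virtual long balance() = 0;
    virtual void deposit(long amount) = 0;
};

// Implementation class written by the server programmer.
// With delegation it does not derive from the skeleton.
class AccountImpl {
public:
    long balance() const { return balance_; }
    void deposit(long amount) {
        assert(amount >= 0);
        balance_ += amount;
    }
private:
    long balance_ = 0;
};

// Tie: a skeleton instance that forwards every operation to an
// implementation object it refers to but does not own.
template <class Impl>
class AccountTie : public AccountSkeleton {
public:
    explicit AccountTie(Impl& impl) : impl_(impl) {}
    long balance() override { return impl_.balance(); }
    void deposit(long amount) override { impl_.deposit(amount); }
private:
    Impl& impl_;  // reference only; the implementation outlives the tie
};
```

In this sketch several short-lived AccountTie objects can be created and destroyed around one AccountImpl, which keeps its state throughout. With inheritance, destroying the skeleton part would necessarily destroy the implementation part as well.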


2.3 Pseudopersistence


Our decisions can be summarized as follows:


• The ODA supports persistent CORBA objects implemented
with the delegation approach.


• CORBA servers keep only implementation objects
in persistent memory.


• The ODA is responsible for dynamically instantiating
and releasing transient ties to persistent implementation
objects, so that full CORBA objects are
available whenever they are needed.


Even though “persistent CORBA objects” are not fully
kept in persistent memory, to their clients they appear
as long-lived objects. Accordingly, we call this scheme
pseudopersistence. In what follows, a pseudopersistent
tie, or simply p-tie, is a transient tie to a persistent
implementation object.


As any tie, a p-tie has a data member that specifies the 
implementation object to which the tie delegates opera- 
tions. In a regular tie, this data member is a C++ pointer 
or reference. In a p-tie, it must be an ODBMS reference 
(d_Ref), for it points to an implementation object in per- 
sistent memory. 


When a p-tie is instantiated, one must initialize its 
d_Ref data member. To support the instantiation of a 
p-tie given a CORBA object reference, the ODA embeds 
a d_Ref to an implementation object into every CORBA
reference it generates. This embedding takes advantage 
of the id (also known as ReferenceData) field of the 
object reference. The id, an octet sequence opaque to 
the ORB core, contains identification information local 
to the server in which the CORBA reference was gener- 
ated. References to p-ties are generated and interpreted 
by the ODA, which embeds d_Refs into their ids. 


Figure 3 illustrates the pseudopersistence scheme. A
request to a dormant object arrives through the ORB core
(1), causing an upcall to an ODA-provided object activation
function. The id field of the target object reference
is passed as a parameter to this function. This id contains
a stringified d_Ref to a persistent implementation
object. The ODA extracts the d_Ref from the id and
passes it as an argument to an instantiation function (2),
which constructs the target CORBA object as a p-tie to
the implementation object specified by the d_Ref. The
incoming request then reaches the target object as an upcall
through the IDL skeleton (3). At the end of the operation,
another upcall to the ODA (4) causes the target
object to be released.

Figure 2: Interface implementation approaches.


In Figure 3, the object activation upcall happens be- 
cause the target of an incoming request is a dormant ob- 
ject. Upcalls also happen in the case of dormant objects 
referenced by request parameters, or by strings passed to 
string_to_object. 
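
The activation sequence of Figure 3 can be sketched in a few lines of C++. This is a simplified model under stated assumptions, not the ODA code: PRef is a hypothetical stand-in for d_Ref, the Database map stands in for the ODBMS, and the decimal string form of the reference is invented for the example (a real ODBMS defines its own string conversion).

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Stand-in for a persistent implementation object.
struct CounterImpl {
    int value = 0;
};

// Stand-in for the database: object ids mapped to implementation objects.
using Database = std::map<unsigned long, CounterImpl>;

// Stand-in for d_Ref<CounterImpl>: a database handle plus an object id.
class PRef {
public:
    PRef(Database& db, unsigned long oid) : db_(&db), oid_(oid) {}
    CounterImpl* operator->() const { return &db_->at(oid_); }
    // Hypothetical string form used as the id of the object reference.
    std::string to_string() const { return std::to_string(oid_); }
    static PRef from_string(Database& db, const std::string& s) {
        assert(!s.empty());
        return PRef(db, std::stoul(s));
    }
private:
    Database* db_;
    unsigned long oid_;
};

// Transient p-tie: delegates through the persistent reference.
class CounterPTie {
public:
    explicit CounterPTie(PRef impl) : impl_(impl), id_(impl.to_string()) {}
    int increment() { return ++impl_->value; }
    const std::string& id() const { return id_; }
private:
    PRef impl_;
    std::string id_;  // embedded in every reference to this p-tie
};

// Step (2): activation upcall, given the id taken from the object reference.
std::unique_ptr<CounterPTie> activate(Database& db, const std::string& id) {
    return std::unique_ptr<CounterPTie>(
        new CounterPTie(PRef::from_string(db, id)));
}

// Steps (1) to (4) for one request: activate, dispatch, release.
int handle_increment_request(Database& db, const std::string& id) {
    std::unique_ptr<CounterPTie> tie = activate(db, id);  // (2)
    return tie->increment();                              // (3)
}   // (4) the p-tie is released when the operation ends
```

Two successive requests carrying the same id create two different p-ties, yet both update the same stored CounterImpl, which is what makes the object appear long-lived to its clients.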

Pseudopersistence should be understood in the context 
of the architecture in Figure 1. Persistent implementation 
objects do not live within a CORBA server, as the sim- 
plified representation in Figure 3 may suggest, but in a 
database server. Multiple CORBA servers (middle-tier 
processes) may be clients of a database server; they may 
or may not run in the same node as the database server. 
Moreover, a persistent implementation object may be 
shared by multiple p-ties, each in the address space of 
a different middle-tier process.⁹


3 Implementation Issues 


The ODA is implemented as a library that uses and extends
the services of the native OA. It requires changes
to the IDL translation process, which must become
ODA-aware. These changes, as well as the actions of 
the ODA library, are examined below. 


3.1 IDL Translation Issues 


• Any tie class has a data member that references the
implementation object to which ties delegate operations.
This data member is usually a C++ pointer
or reference. In the case of a p-tie class, however, it
must be a d_Ref.


⁹ To exemplify: consider a persistent CORBA object implemented
by an Orbix server whose activation mode is per-client-process, and
suppose that multiple clients are concurrently using this object. Every
client interacts with its own middle-tier process, a distinct execution
of the same server program. Each middle-tier process has a p-tie that
incarnates the persistent CORBA object. All these p-ties share the same
persistent implementation object, which is managed by the ODBMS.


• Code to support the management of p-ties by the
ODA library must be generated within every p-tie
class. In our implementation, p-tie constructors and
destructors perform ODA-related actions. Moreover,
each p-tie class makes available to the ODA
library a static function for p-tie instantiation.


The constructor of a p-tie class embeds into the p-tie's
id a stringified d_Ref to the p-tie's implementation object.
It also registers the p-tie with the ODA library;
the p-tie will eventually be unregistered by its destructor.
The p-tie instantiation function receives a d_Ref to a
persistent implementation object and creates a new p-tie
to this object.

Special translation requirements do not necessarily 
mean another IDL translator. Our ODA implementa- 
tion actually employs the IDL translator provided by the 
ORB, complementing it with macros. The object im- 
plementor annotates the server code with ODA-defined 
directives, which macro-expand into p-tie class defini- 
tions. No changes are made to any files generated by the 
IDL translator: ODA directives are placed only in user- 
written files, and typically within server headers. In what 
follows, an ODA-generated function (ODA-generated 
class) is a function (class) defined by the macro expan- 
sion of an ODA directive. 


3.2 ODA Actions


• The ODA library receives object activation upcalls
from the native OA, forwarding each such upcall to
the appropriate p-tie instantiation function.


• At the end of every operation, after any results have been
marshaled into a reply message, the ODA library
issues release calls on all p-ties instantiated while
the current request was being serviced.


Figure 3: The pseudopersistence scheme.

Because the number of implementation objects in a
database is potentially very large, a CORBA server cannot
keep in-memory ties to all the persistent implementation
objects it touches during its execution. The last item
above addresses the need to release p-ties from time
to time. Each p-tie is instantiated with a “net reference
count” of zero: an initial reference count of one, plus
a pending release call, to be performed by the ODA
at the end of the operation. Unless the server code issues
duplicate calls on them, p-ties have short lifetimes:
they exist only while a request is being serviced. Whenever a
discarded p-tie is needed again, an equivalent one will
be instantiated by an object activation upcall.
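
The “net reference count of zero” rule can be illustrated with a small C++ model. It is a sketch under assumptions: PTie and RequestScope are hypothetical names, the reference count is a plain integer rather than an ORB facility, and the bool flag exists only so that destruction can be observed from outside.

```cpp
#include <cassert>
#include <vector>

// Stand-in for a reference-counted p-tie.
class PTie {
public:
    explicit PTie(bool& destroyed) : destroyed_(destroyed) {}
    void duplicate() { ++refs_; }
    void release() {
        assert(refs_ > 0);
        if (--refs_ == 0) {
            destroyed_ = true;
            delete this;
        }
    }
private:
    ~PTie() = default;   // destroyed only through release()
    int refs_ = 1;       // initial reference count of one
    bool& destroyed_;    // lets the caller observe destruction
};

// Stand-in for the ODA's bookkeeping during one request.
class RequestScope {
public:
    PTie* instantiate(bool& destroyed) {
        PTie* tie = new PTie(destroyed);
        pending_.push_back(tie);  // one release owed at end of operation
        return tie;
    }
    // Performs the pending release call on every p-tie of this request.
    void end_of_operation() {
        for (PTie* tie : pending_) tie->release();
        pending_.clear();
    }
private:
    std::vector<PTie*> pending_;
};
```

A p-tie on which the server code never calls duplicate disappears when end_of_operation runs. One that was duplicated survives until the server code issues the matching release.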


3.3 Caching P-ties


Releasing all p-ties at the end of every operation appears
unreasonable, since the ODA only needs to ensure that
these p-ties will eventually be released. Postponing their
destruction would avoid the costs of successive p-tie
reinstantiations. Our ODA implementation actually caches
the last N p-ties it instantiated, where N is a configurable
parameter. At the end of every operation the ODA brings
the number of p-ties down to N + δ, keeping the most
recent ones. (The δ accounts for any duplicate calls
that might have been issued by the server code.)
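
The trimming step can be sketched as follows. This is an assumption-laden model rather than the ODA implementation: std::shared_ptr stands in for the duplicate/release reference count, PTieCache is a hypothetical name, and the cache simply drops its own reference to everything but the N most recent p-ties.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>

struct PTie { int oid; };  // stand-in for a p-tie

class PTieCache {
public:
    explicit PTieCache(std::size_t n) : n_(n) {}

    // Called when a p-tie is instantiated; newest entries go to the back.
    std::shared_ptr<PTie> instantiate(int oid) {
        auto tie = std::make_shared<PTie>(PTie{oid});
        recent_.push_back(tie);
        return tie;
    }

    // Called at the end of every operation: release the cache's reference
    // to all but the N most recent p-ties. P-ties still referenced by the
    // server code stay alive; they are the "delta" in N + delta.
    void end_of_operation() {
        while (recent_.size() > n_) recent_.pop_front();
        assert(recent_.size() <= n_);
    }

    std::size_t cached() const { return recent_.size(); }
private:
    std::size_t n_;
    std::deque<std::shared_ptr<PTie>> recent_;
};
```

With N = 2 and four p-ties instantiated during a request, the two oldest leave the cache at the end of the operation. An old p-tie that the server code still holds remains alive outside the cache, so the number of live p-ties is N plus the number of such duplicated ones.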

Caching p-ties makes sense if the ODBMS ensures
that their d_Ref data members remain valid across transactions.
So far we have ignored database transactions;
this topic will be discussed in Section 4. Let us assume,
for now, that transactions are started and committed
(or aborted) by means external to the CORBA server,
and that each operation is encompassed by an individual
transaction.

Does a d_Ref from transient to persistent memory retain
its validity across transactions? The ODMG standard
leaves the answer to the discretion of the ODBMS
implementor. In most ODBMSs, such a reference cannot
be used in between transactions, but does remain valid
across transactions. This being the case, the ODA should
cache p-ties.

With caching, the CORBA server must have a way
of forcing the removal of objects from the cache. Accordingly,
the ODA provides a function that receives a
CORBA::Object_ptr and immediately deletes the corresponding
p-tie. This function, ODA::Delete, is intended
to be called by destructors of persistent implementation
objects, with the purpose of avoiding dangling p-ties.


3.4 Converting Implementation Objects to 
CORBA Objects 


The ODA must provide the server code with the means 
for obtaining a CORBA object given its implementation 
object. For each association 


(interface, implementation_class) 


there is an ODA-generated function that takes a d_Ref to
an implementation object and returns a reference (of type
interface_ptr) to the corresponding CORBA object. To
avoid multiple p-ties to an implementation object, this
function is not implemented as a mere call to the p-tie
instantiation function. It first checks whether a p-tie to the
implementation object already exists in the server's address
space, then it returns a duplicated reference to either an existing
p-tie or a newly instantiated one.

A non-standard bind function, present in various 
ORBs, could be used to perform the check mentioned 
above. Given a d_Ref to an implementation object, one 
would convert it to string and obtain an id, which would 
then be passed as an argument to bind. The ODA does 
not use this approach. Instead, it keeps pairs 


(d_Ref, p-tie address)


in its own table of active p-ties, which it hashes by 
d_Refs with a hash function provided by the ODBMS. 
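
A table of active p-ties with the check-then-instantiate behavior described above might look like the following sketch. The names are hypothetical and several simplifications are assumed: PRef stands in for d_Ref, PRefHash for the ODBMS-provided hash function, std::shared_ptr for duplicated CORBA references, and a std::weak_ptr entry replaces the raw p-tie address that p-tie constructors and destructors would register and unregister.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <unordered_map>

// Stand-in for d_Ref<X_i>: only identity matters here.
struct PRef {
    unsigned long oid;
    bool operator==(const PRef& other) const { return oid == other.oid; }
};

// Stand-in for the hash function provided by the ODBMS.
struct PRefHash {
    std::size_t operator()(const PRef& ref) const {
        return std::hash<unsigned long>()(ref.oid);
    }
};

struct PTie {
    explicit PTie(PRef ref) : impl(ref) {}
    PRef impl;
};

class ActiveTable {
public:
    // Returns a reference to the existing p-tie for `ref`, or to a
    // newly instantiated one if none is active.
    std::shared_ptr<PTie> to_corba_object(PRef ref) {
        auto it = table_.find(ref);
        if (it != table_.end()) {
            if (auto existing = it->second.lock()) return existing;
            table_.erase(it);  // stale entry: the p-tie is gone
        }
        auto tie = std::make_shared<PTie>(ref);
        table_[ref] = tie;
        assert(table_.count(ref) == 1);
        ++instantiations_;
        return tie;
    }
    int instantiations() const { return instantiations_; }
private:
    std::unordered_map<PRef, std::weak_ptr<PTie>, PRefHash> table_;
    int instantiations_ = 0;
};
```

Two conversions of the same reference yield the same p-tie as long as it is alive, so the server's address space never holds two p-ties for one implementation object.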


3.5 Non-standard ORB and ODBMS Fea- 
tures Employed 


The ODA relies on the delegation approach, which is
mentioned, but not mandated, by the current revision
(2.0) of CORBA. Orbix and VisiBroker are examples
of commercially available ORBs that support delegation.
Both admit direct instantiation of ties, automatically
registering newly instantiated ties as active CORBA
objects. The new Portable Object Adapter (POA) specification
[1] fully standardizes the delegation approach.
It requires compliant ORB implementations to support
both the inheritance and the delegation approaches.

Because CORBA 2.0 describes object activation in
very general terms, existing ORB implementations vary
widely in their support for object activation. The ODA


builds upon the native OA's object activation capabilities. 
Its Orbix implementation uses a LoaderClass instance; 
the VisiBroker implementation uses an Activator. The 
new POA specification completely defines object activa- 
tion. With the POA, future ODA implementations can 
employ a standard facility, the InstanceActivator in- 
terface. 

Various ORBs provide non-standard “event handling”
or “request/reply intercepting” mechanisms. The ODA
needs such a facility both to release p-ties and to manage
database transactions in the absence of OTS (see
Section 4). Its Orbix implementation uses a Filter;
the VisiBroker implementation uses an EventHandler.
The OMG has recently introduced request level interceptors
[10] as an extension to the ORB core, and is actively
working to complete the specification of this facility.

From the ODBMS, the ODA requires a means of 
converting d_Refs to strings and vice-versa. Although 
supported by many ODBMSs, this feature is not in 
the ODMG standard. Caching of p-ties requires more: 
d_Refs must remain valid across transactions. 


4 Transactions


Any access to persistent memory has to be performed
within a transaction. Leaving to implementation objects
the responsibility of starting and committing (or aborting)
transactions is not an option, because accesses to
persistent memory happen both before and after these
objects' methods are called:


• In order to delegate an operation to its implementation
object, a p-tie must access persistent memory.
The p-tie must dereference its d_Ref data member,
which points to persistent memory.


• Marshaling of operation results into a reply message
may involve accesses to persistent memory.


Usage of OTS [8] would ensure that not just the user- 
provided implementation code, but also request dispatch- 
ing and parameter marshaling code, would be executed 
within transactions. Since OTS interacts directly with 
the local resource manager (the ODBMS), transactions 
would be started and committed (or aborted) by means 
external to the CORBA server. 

If OTS is absent, the ODA must take the responsibility
of starting and committing (or aborting) local transactions,
not with the aim of performing distributed two-phase
commit, but just to ensure that a transaction will
be active whenever an operation is dispatched, and will
remain active until the operation results are marshaled into
a reply message. We did not have OTS, so this was our
scenario.


4.1 Support to Local Transactions


The ODA manages local transactions by employing
ORB-specific “event handling” or “request/reply intercepting”
facilities. Its default transaction mode is transaction
per operation: an “incoming request pre-marshal”
handler starts a transaction as soon as a request arrives,
and an “outgoing reply post-marshal” handler ends the
transaction just before the reply is sent. An operation
implementation may specify whether the current transaction
will be committed or aborted at the end of the operation.
By default, the ODA commits the transaction. Under control
of the server code, the ODA may also switch to another
transaction mode, which allows multiple operations to be
grouped into a single transaction.

Because ObjectStore requires the transaction type 
(read-only or update) to be specified when a transac- 
tion starts, update operations must be registered with the 
ODA. Registration of update operations is typically done 
by the server mainline. By default, the ODA starts read- 
only transactions. In the case of operations previously 
registered as update operations, it starts update transac- 
tions. 
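
The transaction-per-operation mode, together with the registration of update operations, can be modeled by the sketch below. It is illustrative only: FakeDatabase records calls instead of talking to an ODBMS, and the class and member names (TransactionPerOperation, on_request, on_reply, request_abort) are hypothetical stand-ins for the ORB-specific handlers and the ODA functions.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

enum class TxType { ReadOnly, Update };

// Stand-in for the ODBMS transaction interface; it only records calls.
struct FakeDatabase {
    std::vector<std::string> log;
    void begin(TxType type) {
        log.push_back(type == TxType::Update ? "begin update"
                                             : "begin read-only");
    }
    void commit() { log.push_back("commit"); }
    void abort() { log.push_back("abort"); }
};

class TransactionPerOperation {
public:
    explicit TransactionPerOperation(FakeDatabase& db) : db_(db) {}

    // Done once per update operation, typically from the server mainline.
    void register_update_op(const std::string& op) { update_ops_.insert(op); }

    // "Incoming request pre-marshal" handler: start the transaction.
    void on_request(const std::string& op) {
        abort_requested_ = false;
        active_ = true;
        db_.begin(update_ops_.count(op) ? TxType::Update : TxType::ReadOnly);
    }

    // Called by an operation implementation that wants a rollback.
    void request_abort() { abort_requested_ = true; }

    // "Outgoing reply post-marshal" handler: end the transaction.
    void on_reply() {
        assert(active_);
        if (abort_requested_) db_.abort(); else db_.commit();
        active_ = false;
    }
private:
    FakeDatabase& db_;
    std::set<std::string> update_ops_;
    bool abort_requested_ = false;
    bool active_ = false;
};
```

An unregistered operation runs inside a read-only transaction that is committed by default, while a registered one runs inside an update transaction and may ask for an abort before the reply is sent.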


5 ODA Interfaces and Usage 


The CORBA server interacts with the ODA through a 
very small API. Besides ODA-generated functions that 
return an interface_ptr given a d_Ref and vice-versa, 
there are just a few static functions available to the server 
code: 


• ODA::initialize

• ODA::register_update_ops

• ODA::Delete

• ODA::multi_op_transaction_mode

• ODA::abort_transaction

• ODA::commit_transaction


Note that there is no specific function to create or activate
a persistent CORBA object: object activation may occur
as a side effect of the conversion of a d_Ref into a CORBA
object reference.

Given an interface class X and an implementation class
X_i to which X delegates operations, the function


X_ptr ODA_X_i_to_X(const d_Ref<X_i>&);


translates a d_Ref<X_i> into the corresponding X_ptr.
This function, defined at file scope, is generated by
the ODA directive that “ties together” X and X_i. A member
function of the ODA-generated p-tie class performs
the reverse translation (to d_Ref<X_i>).

The ODA is not an intrusive presence in the programming
environment. In our experience, the vast majority
of ODA calls are performed to obtain an
interface_ptr from a d_Ref. Except for these, ODA calls
are relatively rare in the server code. ODA::initialize
is called once, by the server's mainline. Calls to
ODA::register_update_ops typically appear in the
server's mainline only, and would not be necessary in the
case of an ODMG-compliant ODBMS. ODA::Delete is
invoked from destructors of persistent implementation
objects. In the default transaction mode, user-provided
methods do not normally call transaction management
functions.


5.1 Server Organization 


Persistent relationships between CORBA objects within 
a server are actually realized by relationships between 
their corresponding implementation objects. When 
traversing database relationships or performing a data- 
base query, the server code deals only with persis- 
tent implementation objects, not with full CORBA ob- 
jects. Such a traversal or query is therefore executed at 
ODBMS speeds. Consider, for example, the case of an 
operation that performs a search for a particular object 
within a collection of objects. The whole search is per- 
formed at the ODBMS level, without CORBA-activating 
any of the objects of the collection. Its result, a d_Ref 
to a particular implementation object, is then converted to
a CORBA object reference and passed back to the client.
When the server code calls the ODA to perform such 
a conversion, it obtains a duplicated reference to a 
CORBA object managed by the ODA. Whether this ob- 
ject was just activated or was already in the ODA cache 
is irrelevant to the server code, which in either case as- 
sumes the responsibility of releasing the reference. 

Persistent relationships between CORBA objects in
different servers are realized via stringified CORBA references
stored in persistent memory. These references
must be explicitly converted back to their native form before
use. Note that any database containing CORBA object
references is ORB-dependent, because these references
are ORB-dependent. ORB independence is lost when we
move on to an ORB-connected multidatabase.


5.2 Inheritance Issues


Consider the following IDL interfaces:


interface X { ... };

interface X1 : X {
    ...
};

interface X2 : X {
    ...
};

interface Y {
    readonly attribute X x;
    ...
};


Interface X defines operations that are common to both 
X1 and X2. Attribute x of Y has interface type X, but its 
most derived interface is either X1 or X2. 

A natural organization for the corresponding persistence-capable
implementation classes¹⁰ would be:


class X_i : public d_Object {
    // abstract class
};

class X1_i : public X_i {
};

class X2_i : public X_i {
};

class Y_i : public d_Object {
public:
    X_ptr x();

private:
    d_Ref<X_i> x_i;
};


X_i is an abstract class: any instance of this class is
an instance of either X1_i or X2_i. Class Y_i holds an
ODBMS reference to an instance of X_i in its private data
member x_i. The attribute accessor Y_i::x() returns a
CORBA reference to the object whose implementation is
x_i.

Note, however, that there is no ODA-generated function
that takes a d_Ref<X_i> and returns an X_ptr. The
ODA provides this conversion function only when the
interface skeleton and the implementation class are tied
together by delegation. This is never the case for an inherited
implementation class, such as X_i. In the example
above, there are ODA-generated conversion functions
from d_Ref<X1_i> to X1_ptr and from d_Ref<X2_i> to
X2_ptr.

¹⁰ We adopt the convention of naming implementation classes by
appending an “_i” to the corresponding interface names.

ODA users solve this problem by defining a virtual
member function, say get_X_ptr(), in class X_i. This
function, declared as pure virtual in X_i, is redefined by
the derived classes X1_i and X2_i as below:


X_ptr X1_i::get_X_ptr() {
    return ODA_X1_i_to_X1(d_Ref<X1_i>(this));
}

X_ptr X2_i::get_X_ptr() {
    return ODA_X2_i_to_X2(d_Ref<X2_i>(this));
}


If the inheritance chain were longer, all abstract implementation
classes would define get_X_ptr() as pure
virtual.


6 Related Work 


Work recently concluded at the OMG, in the context of 
the ORB Portability Enhancement RFP [9], has resulted 
in a Portable Object Adapter [1] that will reduce the 
ODA dependencies on non-standard ORB features. Ear- 
lier ORB portability proposals [5, 3] included a Server 
Framework Adapter (SFA) and an ODMG model for 
SFA. Our pseudopersistence scheme is essentially a realization
of the ODMG model for SFA, as outlined in
Appendix C of [3].

A number of ORB and ODBMS vendors have announced
plans for the integration of their products; some
of these integrated solutions are already being delivered.
Probably the first one was Iona Technologies'
Orbix+ObjectStore Adapter (OOSA) [6], whose beta release
became available in late 1995. Since then, Iona has
integrated Orbix with Versant, and has announced plans
for integrating Orbix with O2 and with Persistence.

Iona's OOSA takes advantage of the particular way
CORBA objects are laid out by the ORB. In Orbix, not all
data encapsulated by a CORBA::Object instance appears
directly in its data members. Instead, a data member of
CORBA::Object points to an auxiliary object. Some of
the “logical” data members of CORBA::Object are actually
in this auxiliary object. The reference count is one
of them.

Unlike the ODA, which stores only implementation
objects, OOSA actually stores CORBA objects in ObjectStore
databases. A CORBA object, however, is not
stored in its entirety: to avoid the performance penalty
of having reference counts in persistent memory, OOSA
does not store the auxiliary object in the database. Instead,
it dynamically instantiates auxiliary objects as persistent
CORBA objects are made available in ObjectStore's
client cache. When such an auxiliary object is instantiated,
the corresponding CORBA object is inserted
into the per-process table of active objects maintained by
Orbix. This approach allows persistent CORBA objects
to be implemented either by inheritance or by delegation.
It also allows object relationships to be expressed
in terms of CORBA objects, not just in terms of implementation
objects. Its disadvantages are some waste of
database space, ORB-dependent databases, and the performance
penalty of object activations triggered by database
accesses.


7 Concluding Remarks 


We have presented the design and implementation of an
ODA that allows execution of database traversals and
queries at the full speed of the underlying ODBMS. Only
what needs to be persistent is kept in persistent memory;
ODA users are not forced to store ORB-specific information
persistently. Databases are ORB-independent unless
the user explicitly places ORB-specific data (such as
stringified object references) in persistent memory. Finally,
the ODA design appears to be general enough
to be applicable to any ODBMS. ObjectStore's virtual
memory-based architecture makes it different from all
other ODBMSs in many aspects. That the ODA design
can be described in ODMG terms, and yet be implemented
for ObjectStore, is strong evidence of its applicability
to any ODBMS.

The ODA's pseudopersistence scheme appears to be an
optimal solution for integrated ORB/ODBMS environments
in which object relationships are mostly confined
within a CORBA server. In such a scenario, there is no
reason to express database relationships at the CORBA
level, as they are much more efficiently realized at the
level of implementation objects.

The motivation for representing database relationships
at the CORBA level might arise in the context of an
ORB-connected multidatabase with many cross-server
references. Expressing persistent relationships between
objects in different servers via stringified CORBA references
placed in persistent memory may be inconvenient
in this case. Consider, for example, a situation in which
it would be desirable for a server to have a persistent and
homogeneous collection of object references, whose elements
may refer to either local or remote objects. This is
not possible in the pseudopersistence scheme. Instead of
a uniform collection, two distinct sub-collections must be
used: one with d_Refs to local implementation objects,
the other with stringified CORBA references to remote objects.
Intra-server references and inter-server references
could be unified if the Object Adapter provided support
for persistently representing both at the CORBA level.
To be useful, this unification should allow transparent use
of stored CORBA references to invoke methods on possibly
remote objects. Note, however, that incurring the
cost of such a unification (the performance penalty of
expressing intra-server references at the CORBA level)
makes sense only if cross-server references occur much
more often than intra-server references.


References 


[1] BEA, DEC, Expersoft, HP, IBM, ICL, IONA, Novell,
SunSoft, and Telefónica I+D. ORB Portability
Joint Submission, Draft 14. OMG Document
orbos/97-04-04, April 1997.


[2] R. G. G. Cattell, editor. The Object Database Standard:
ODMG-93, Release 1.2. Morgan Kaufmann,
1996.


[3] DEC, Expersoft, HP, IBM, ICL, IONA, Novell,
SunSoft, and Telefónica I+D. ORB Portability Joint
Submission, Draft 5. OMG Document orbos/96-12-02,
December 1996.


[4] D. W. Forslund, R. L. Phillips, D. G. Kilman, and
J. L. Cook. Experiences with a distributed virtual
patient record system. Journal of the American
Medical Informatics Association, Symposium Supplement,
1996.


[5] HP, IBM, Novell, and SunSoft. Server Framework
Specification (ORB Portability Submission). OMG
Document orbos/96-05-03, May 1996.


[6] Iona Technologies. Orbix+ObjectStore Adapter:
Beta Release Documentation. Dublin, Ireland,
1995.


[7] Object Management Group. The Common Object
Request Broker: Architecture and Specification.
Revision 2.0, July 1995.


[8] Object Management Group. CORBAservices:
Common Object Services Specification. Revised
Edition, March 1995. Updated November 1996.


[9] Object Management Group. ORB Portability
Enhancement RFP. OMG Document 95-06-26, June
1995.


[10] Object Management Group. CORBASecurity.
Version 1.1, OMG Document Numbers 96-08-03
through 96-08-06, July 1996.


[11] Object Management Group. Persistent Object Service,
version 2.0: Request For Proposal (Draft).
OMG Document orbos/97-04-07, December 1997.


[12] F. C. R. Reverbel. Object Database Adapter
Programmer's Guide and Reference Manual. Advanced
Computing Laboratory, Los Alamos National
Laboratory, Los Alamos, NM, August 1996.


[13] F. C. R. Reverbel. Persistence in Distributed Object
Systems: ORB/ODBMS Integration. PhD thesis,
University of New Mexico, Computer Science
Department, Albuquerque, NM, May 1996.


[14] S. Vinoski. CORBA: Integrating diverse applications
within heterogeneous environments. IEEE
Communications, 14(2), February 1997.


USENIX Association Conference on Object-Oriented Technologies and Systems - June 16-20, 1997 65 




Obtuse, a scripting language for migratory applications 


Robert P. Cook 
Dept. of Computer and Information Science 
University of Mississippi 
www.cs.olemiss.edu/~bobcook; bobcook@cs.olemiss.edu 


Abstract 

This paper discusses the design and implementation of 
Obtuse, a scripting language for migratory applications. 
The paper reviews the pertinent ActiveX technology 
that provides the runtime object infrastructure. Then we 
discuss the Obtuse object model and present an 
overview of the language. Next, several sample
programs are used to illustrate the concepts. Finally, we 
review some of the problems with DCOM, based on our 
experience. 


Keywords: scripting language, Obtuse, migratory
applications, Obliq, distributed systems


1. Introduction 


Obtuse was designed by the author, inspired by 
Cardelli’s Obliq [1] and Bharat’s Visual Obliq [2]
systems, and implemented as part of a summer research 
appointment at Microsoft Corporation. The goal was to 
explore the potential of several core ActiveX 
technologies [3,4,5], including COM (Component 
Object Model), Automation, and DCOM (Distributed 
Component Object Model). 


Obtuse is unique in two respects; first, in its synergistic 
use of ActiveX technology and second, in its ability to 
transfer the state of a Visual Basic form from one 
machine to another. A migratory application is one 
that can transfer program state (including the user 
interface) to different Internet locations under program 
control. Other terms used in the literature are mobile or 
transportable agents. Obtuse is also a scripting 
language; that is, it defines sentences capable of being 
executed as fine-grained code fragments. 


As an example of transferring UI state from one 
machine to another, consider the Visual Basic (VB) 
form in Figure 1, which consists of an edit control and a 
button. The form is used in a simple, routing-slip 
application. The user can type a command line, such as 
“obtuse poll m1 m2 m3” to initiate execution. The list (a
routing slip) represents a sequence of computers to visit. 
The DCOM infrastructure is utilized by the Obtuse 


runtime to implement the remote activation that is 
necessary to support the routing-slip application. 


The form is circulated to the machines in the order 
listed. The accumulated comments are available to each 
recipient and the completed form, with all comments, is 
returned to the source machine. When one user clicks 
the OK button, the form is moved to the screen of the 
next computer in the list. 


Figure 1. User Interface for a Roving Poller


We refer to programs, such as the routing-slip example, 
as in-your-face applications. When one user clicks the 
routing slip’s OK button, the document appears 
instantaneously on the screen of the next recipient. 
Most word processors also support routing slips by 
using e-mail as the transport mechanism. However, 
users are only notified if an e-mail client is executing at a 
site and if they decide to read their mail. 


In the routing-slip application, the code, together with 
its execution state, can also migrate from one machine 
to another with the form. Obtuse implements program 
migration by exposing threads and contexts (a 
program’s global variables) as COM objects. 


Figure 2 lists a simple Obtuse program that moves itself 
from one machine to another. In Obtuse, a running 
program is a collection of COM objects, which support 
an Automation interface. The sample program creates a 
thread and a context object on a remote machine. Next, 
the Fork method of the running-thread object is invoked 
to clone the program’s state. At this point, there are 
two threads executing, one locally and one remotely. 
However, since they have duplicate contexts, any object
references in one are duplicated in the other. As a 
result, the remote thread can access objects on the 
parent machine in a location-opaque fashion. 


// Note that variables are "typed" at runtime
var me, thread, context, where;
me := self;   // a reference to the executing thread
print("parent process starting");
where := "louie.cs.olemiss.edu";

// create a remote thread
thread := object("Bob.Thread", where);

// create a remote context
context := object("Bob.Context", where);

// duplicate myself at "where"
if me.Fork{thread, context}=1 then
   print("parent stopping");
   quit;   // 1 returned to parent, it quits
end;
print("we made it to", where);
quit;


Figure 2. A Sample Obtuse Program 


Obtuse is unique in that the mechanisms to support the 
runtime (threads, contexts, stacks) are all COM objects. 
Another unique aspect is that Obtuse uses Visual Basic 
forms to implement user interfaces. These forms can 
also be marshaled in order to transport their state from 
one machine to another. Since Obtuse is based on
COM, it can be used to manipulate any COM
Automation object, which includes all Office
applications and ActiveX controls. The DCOM
infrastructure supports the remote location and
activation of COM objects.


Other features of Obtuse include support for script- 
based execution, support for multiple threads, runtime 
strict typing, and a machine-invariant program
representation. An Obtuse program can consist of a 
sequence of expressions with no variables, a series of 
statements on a set of global variables, or a collection of 
procedures. Furthermore, an Obtuse program can 
invoke an Automation object’s methods and access its 
properties at runtime; it is not necessary to “import” or 
“include” interfaces. 


Obtuse does not support compile-time type binding. 
The type checking in expressions and procedure calls is
performed at runtime. However, Obtuse is “strict”; that
is, types must match exactly on operations such as
comparison or multiplication. Variables are bound to a
type on runtime assignment. From that point on, until
another assignment occurs, that variable must be type
compatible with every operator that is applied to it.


Programs are UNICODE-based and compile to a
machine-invariant representation that encodes the 
source program. That is, the object code can be 
inverted to recover the original source, including 
comments. 


The paper first presents an overview of ActiveX 
technology. Then we discuss the Obtuse object model 
and present an overview of the language. Next, several 
sample programs are introduced to illustrate the 
concepts. Also, we present some performance
measurements for Obtuse/ActiveX. Finally, we review 
some of the problems with DCOM based on our 
experience. 


2. ActiveX - COM and DCOM


The two most important aspects of ActiveX for 
scripting support are its implementation of dynamic 
method binding and invocation, as well as
self-describing types. Dynamic method binding is the
technology (Automation) that enables Visual Basic 
applications to manipulate Office documents, such as 
spreadsheets or slide presentations. It also enables 
HTML scripting support (VBScript) in Microsoft's 
Internet Explorer 3.0. 


The dynamic, or late, binding technology enables an 
object to expose methods and properties for use by 
other objects. The technology also supports the lookup 
of method and property names and a mechanism to build 
and execute a procedure call at runtime. It is a separate 
set of code from COM. 


Self-describing types (or variants, as they are termed in
COM) are the key underlying representation for data
types in the Visual Basic language common to VBScript 
and VB. In the next sections, we present an overview 
of COM and DCOM. 


2.1 COM - Component Object Model


An “object” in COM typically has a document type, 
such as .xls, .ppt, or .doc. Each document type can have a
registered server. For example, winword.exe is the
server for *.doc objects. Objects also have a registered
application name (e.g. “Microsoft Word Document”)
and a globally-unique identification number, called a 
class id (CLSID). 


The association between a class and its server is
maintained in a persistent store called the registry.
There is one registry per machine and there is currently
no “yellow pages” server to support object lookup for
distributed services, although one is reputed to be
available shortly.


The COM model is language independent; it may have a
concrete implementation in a particular language, such 
as C++, but the relationship between COM and different 
languages is orthogonal. For example, Obtuse is 
implemented in C++ but it uses COM objects, which are 
implemented in Visual Basic, to define its user interface. 


A COM object is defined by its support for a collection
of interfaces, each of which is tagged by a 128-bit
globally unique interface identifier (IID). The interfaces
that an object supports can vary over time, and the
interfaces need not have any other relationship (such as
inheritance). There is only one requirement, i.e.,
EVERY COM INTERFACE MUST INHERIT
FROM THE IUnknown INTERFACE, which is listed
in Figure 3.


virtual HRESULT QueryInterface ( 
InterfaceID & riid, 
LPVOID * ppvObj)=0; 
virtual HRESULT AddRef(void) = 0; 
virtual HRESULT Release(void) = 0; 


Figure 3. COM IUnknown Interface


The power of COM derives from several of the
requirements satisfied by the IUnknown implementation.
First, QueryInterface must be used to obtain an object
handle for an instance variable x (as in x.QueryInterface)
that supports a particular interface (identified by the
InterfaceID argument). If the object x does not support
the interface, the HRESULT returned indicates an error.
In C++, a COM object handle is a “pointer to a pointer
to a vTable”. The vTable is generated in C++ because
the interface is “pure virtual”, as are all COM interfaces.


Since every interface is required to inherit from 
IUnknown, any object handle can be used to retrieve a
handle for any interface that the object supports by
calling QueryInterface at any time. Further, the object
handles are reference counted. QueryInterface 
increments an object’s reference count and so does 
AddRef. A Release call decrements an object’s 
reference count. 
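

The reference-counting rules above can be sketched in a few lines of Python. This is only a toy model and not the COM API: the class name ComObject, its method names, and the use of string interface names in place of 128-bit IIDs are all assumptions made for illustration.

```python
class ComObject:
    """Toy stand-in for a COM object; not the real COM API."""

    def __init__(self, interfaces):
        # Every object supports IUnknown in addition to what it declares.
        self._interfaces = set(interfaces) | {"IUnknown"}
        self.ref_count = 0
        self.destroyed = False

    def query_interface(self, iid):
        """Return a counted handle if iid is supported, else raise."""
        if iid not in self._interfaces:
            raise LookupError("interface not supported: " + iid)
        self.add_ref()              # a successful query hands out a counted handle
        return self

    def add_ref(self):
        self.ref_count += 1
        return self.ref_count

    def release(self):
        self.ref_count -= 1
        if self.ref_count == 0:
            self.destroyed = True   # a real server would free the object here
        return self.ref_count


thread = ComObject({"IDispatch", "IThread"})
h1 = thread.query_interface("IUnknown")   # count is now 1
h2 = h1.query_interface("IThread")        # any handle can reach any interface
h2.release()
h1.release()                              # count reaches 0
```

Each successful query is paired with exactly one release, which is the discipline that lets the last release trigger destruction.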


2.2 DCOM - Distributed COM 


DCOM extends COM in a number of ways. First, 
objects can be remotely activated and a handle returned 
to the activating site. The returned object handle can be 
used by a program in a location-opaque fashion; that is,
the programmer need not be aware of the object’s 
location. Second, DCOM imposes location, security and 
identity restrictions on COM objects. Each site has total 
control over who can activate an object, how objects are 
activated, and with what permissions object servers can
execute. Third, DCOM implements reference counting
across machine boundaries and garbage collection. 
Finally, DCOM automatically remotes calls to COM 
interfaces that are supported by remote objects. For 
user-defined interfaces, an IDL compiler must be used 
to generate proxy stubs for the client/server sides. 
Typically, both stubs are included in a single DLL. 


2.2.1 Remote activation 


The DCOM method to activate (cause its server to be 
loaded) a remote object is CoCreateInstanceEx. The 
arguments to the method are the object’s class id, a 
machine name, and a list of interface ids.


Machines are identified using the naming scheme of the 
network transport layer. By default, all UNC (\\chairpc) 
and DNS names (“chaircom” or “135.9.19.33”) are 
supported. Object search is restricted to a single 
machine at present. DCOM has no notion of distributed 
scope or of distributed search paths. 


To optimize network performance, the CoCreate call 
may specify a list of interface ids. Thus, N object 
handles can be retrieved in a single round-trip to the 
server site. Conceptually, this is analogous at runtime to 
the “import java.lang.*” convention in Java, which can
be used to import all of the classes in a package at 
compile-time. 


2.2.2 Access control 


Access to objects can be regulated under program 
control using the NT security API; however for most 
Obtuse users, the utility program dcomcnfg is the point
of control. This program lists the application objects 
that are “registered” on a particular machine. The 
Location, Security, and Identity of each object can be 
separately controlled. The Location options are “run 
here” or “run there”. The latter option supports 
forwarding a requested activation from one computer to 
another. The Security option supports editing the 
access control lists (ACLs) for activation, access and 
configuration. NT provides very fine-grained access 
control so that individual users, or groups, can be 
specified. 


The Identity option designates the protection domain in 
which a server is executed. The choices are the domain 
of the interactive user, the launching user, a particular 
user, or a system service. For example, the “particular 
user” option can be used to solve the “game 
accounting” problem, which is to let a user run a game
program that can write its list of winners to a file that is
not accessible to any of the players. The appropriate
protection domains can be created by having one
DCOM object (player’s domain) play the game while
communicating with another DCOM object (game’s
domain) that records the scores.


2.2.3 Reference counting 


As we discussed in Section 2.1, COM defines a 
mechanism to reference count object handles. If a 
program fails, any cross-process links must be broken to 
properly release objects. Similarly for DCOM, the 
system must account for inter-machine links and must 
break links when processes terminate or fail. For 
distributed systems, there are also the possibilities of 
node crashes and communication outages. The DCOM 
implementation addresses these problems. 


2.2.4 Remote procedure call 


DCOM automatically remotes inter-node calls on COM 
interfaces. The arguments are marshaled through the 
normal remote procedure call (RPC) mechanism. RPC 
on user-defined interfaces requires the use of the IDL 
compiler to generate client- and server-side proxy stubs. 


The automation interface (IDispatch) can be used to 
“late bind” a procedure call; that is, a program can build 
a procedure call at runtime. The automation interface
provides the object-access infrastructure for any
scripting language, such as Obtuse, VBScript, 
JavaScript or AppleScript. 


COM supports the registration of type libraries that 
describe an object’s properties and methods (also 
argument lists and return values). As a property 
example, a button object might have BackgroundColor 
and Text properties, which could be accessed or 
modified remotely using Obtuse. The IDispatch interface
includes methods to “query” for the id of a method or
property name and then to “invoke” that method or
“access” that property. The Automation runtime builds
the argument list in a format that is compatible with the
target language and handles the call/return processing. 
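

The query-then-invoke pattern can be illustrated with a small Python model. It is a sketch under stated assumptions: DispatchObject, get_id_of_name, invoke, and the Button class are hypothetical names, and the round_trips counter merely stands in for network calls. In real Automation the corresponding IDispatch methods are GetIDsOfNames and Invoke.

```python
class DispatchObject:
    """Toy IDispatch-like wrapper around a plain Python object."""

    def __init__(self, target):
        self._target = target
        # Assign a small integer id to every public method name.
        names = sorted(n for n in dir(target)
                       if not n.startswith("_") and callable(getattr(target, n)))
        self._ids = {name: i + 1 for i, name in enumerate(names)}
        self._names = {i: name for name, i in self._ids.items()}
        self.round_trips = 0        # stands in for calls that cross the network

    def get_id_of_name(self, name):
        self.round_trips += 1
        if name not in self._ids:
            raise LookupError("unknown member: " + name)
        return self._ids[name]

    def invoke(self, member_id, args):
        self.round_trips += 1
        return getattr(self._target, self._names[member_id])(*args)


class Button:
    def __init__(self):
        self.text = "OK"

    def set_text(self, value):
        self.text = value
        return self.text


button = DispatchObject(Button())
member_id = button.get_id_of_name("set_text")   # first call: bind the name to an id
result = button.invoke(member_id, ["Cancel"])   # second call: build and make the call
```

A caller that keeps member_id around can skip the first step on later calls, which is one way to reduce the per-call cost of late binding.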


2.2.5 Variant data 


Another aspect of the automation solution is a 
“universal” data type termed a variant. A variant is a 
“union” of about 40 different base types that also 
includes arrays of those types, and arrays of arrays. An 
array can be created with homogeneous elements of a 
particular type or with variant-type elements, each of 
which can be of any type. 


IDispatch and IUnknown object handles are two of the
possible variant-record base types. Since IDispatch is
one of the “builtin” COM interfaces, it is remoted
automatically by DCOM. In Obtuse, all argument lists
to procedures, return values, and property values are 
encoded as variant data. 


3. Obtuse Language Overview 


The Obtuse system consists of a compiler and an 
interpreter, and a collection of COM objects. The 
compiler’s output is a UNICODE text string that 
encodes the source program, including comments. The 
executable can be inverted to produce the original 
source program. Thus, after a program is initially 
compiled, there is only one representation, which can be 
used for both execution and symbolic debugging. 


Obtuse supports only one data type (variant), so a 
variable declaration is just a list of identifiers. The type 
is implicit. A form of type checking is supported based 
on the notion of assignment-typing. Basically, every 
assignment statement binds a new type to an identifier as 
well as a new value. Expression evaluation is type
checked at runtime. There is no implicit conversion as 
in VB; that is, type checking is “strict”. 
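

Assignment-typing with strict checking might look like the following Python sketch. The names Variable, StrictTypeError, and strict_multiply are illustrative assumptions, and Python's own types stand in for variant base types; the Obtuse interpreter itself is not shown in the paper.

```python
class StrictTypeError(TypeError):
    """Raised when operand types do not match exactly."""


class Variable:
    """A slot whose type is whatever the last assignment stored in it."""

    def __init__(self, value=None):
        self.value = value

    def assign(self, value):
        self.value = value          # binds a new value and, implicitly, a new type

    def type_name(self):
        return type(self.value).__name__


def strict_multiply(left, right):
    """Multiply two variables, refusing any implicit conversion."""
    if type(left.value) is not type(right.value):
        raise StrictTypeError(
            "cannot multiply %s by %s" % (left.type_name(), right.type_name()))
    return Variable(left.value * right.value)


x = Variable()
y = Variable()
x.assign(6)
y.assign(7)
product = strict_multiply(x, y)     # both operands are int: allowed

y.assign(7.0)                       # y is rebound to float, x is still int
try:
    strict_multiply(x, y)
    mismatch_rejected = False
except StrictTypeError:
    mismatch_rejected = True
```

The check happens at evaluation time, so the same variable can legally hold different types at different points in a program.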


In Obtuse, the “object” built-in function maps a 
registered object name at a particular machine to an 
object reference. For example, the function call 
object(“Bob.Thread”, “foo.univ.edu”) would check 
the registry on the designated machine and then load the 
server if necessary. 


Once an object reference is obtained, the program can 
manipulate the properties of that object or invoke its 
methods. Assignment of object references copies the 
reference, not the value, even if the assignment crosses 
machine boundaries. The DCOM reference counting 
infrastructure tracks each copy. For assignment of other 
variant values, including arrays, Obtuse copies the value. 
The rule is simple: sharing can only be accomplished
through COM objects. 


Obtuse implements a common set of statements such as 
if, loop, for, case, in addition to variable and method 
declarations. Pointer, structure and class declarations 
are not supported. A qualified reference can be used to 
access an object’s properties. Since an object’s methods 
are dynamically bound using the IDispatch automation
interface, the compiler cannot perform checking for 
undefined names or mismatched argument lists. To 
facilitate some checking, calls to Obtuse procedures are 
delineated with the traditional “( )” and calls to an 
object’s methods use “{ }”. 


As mentioned earlier, Obtuse programs are encoded as 
UNICODE strings. Sufficient information is retained in
the encoding to invert the object code to the source. 
The opcodes were designed to use a character encoding
so that program fragments could be embedded in
documents, sent as mail messages, or be applied as 
drag-and-drop operators on user-interface objects. 
Figure 4 lists several example encodings. The blank, 
tab, and new-line opcodes are no-operations. 


The " opcode designates a constant. Constants are
translated at runtime so the opcode includes a type
designator, the length of the string, and the text
constant. This is not very efficient but it does avoid
representation issues, such as for floating-point
numbers. Small integers are encoded as individual
opcodes. The opcode design also took into account the
requirements for the next version of Obtuse in which
type modules, such as Complex numbers, could be
called upon to parse their own constant representation.
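

As an illustration of a text-based constant opcode, the Python sketch below encodes a constant as a quote character, a one-character type designator, a two-digit length, and the text. The field widths are an assumption inferred from the sample fragment "L0245 in Figure 4; the paper does not specify them, and the function names are hypothetical.

```python
def encode_constant(type_code, text):
    """Encode a constant as: quote opcode, type letter, 2-digit length, text."""
    if len(type_code) != 1:
        raise ValueError("type designator must be one character")
    if len(text) > 99:
        raise ValueError("text too long for a two-digit length field")
    return '"' + type_code + "%02d" % len(text) + text


def decode_constant(code, pos=0):
    """Decode one constant starting at pos; return (type, text, next_pos)."""
    if code[pos] != '"':
        raise ValueError("not a constant opcode")
    type_code = code[pos + 1]
    length = int(code[pos + 2:pos + 4])
    start = pos + 4
    return type_code, code[start:start + length], start + length


encoded = encode_constant("L", "45")
decoded = decode_constant(encoded + " +")   # trailing opcodes are left untouched
```

Because the constant stays as text until runtime, the encoding is independent of any machine's numeric representation, at the cost of a conversion on every execution.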


Code Fragment           Encoding
print(3/24/76);
print(3+45);            3 "L0245 + pO q
if a>3 then y := 6;     Y3 L1 3 > 1005 6 S2 J015
elsif a>2 then          L1 2 > 1005 8 S2 J007
else y := 9; end;       Y5 9 S2 Y6

OpCode   Key
"        Load Constant
Y        Syntax Marker
L        Load Variable
S        Store Variable
>        Compare
J        Forward Jumps


Figure 4. Program Encoding Example


The Y opcodes encode the syntax of the source 
program. Even the comments in the source are 
encoded, but the comment opcode is treated as a no-op 
at runtime. The compiler attempts to generate code for 
branch instructions so that syntax markers are not 
included in loops. 


4. Obtuse Object Model 


The initial Obtuse implementation supports FORM,
FILE, MUTEX, THREAD, CONTEXT, and STACK
objects. FORM objects are Visual Basic forms, which
can contain any VB control. Each FORM object
represents one VB form. Since there are hundreds of
different VB controls, the Obtuse user interface model
has a broad range of capabilities. As a result, forms can
be constructed as the user interface for almost any 
application. 


The FORM interface is implemented as a Visual Basic 
program. VB supports the creation of programs that 
support COM interfaces (particularly IDispatch, the
Application Automation interface). As a result, these
programs can be activated remotely using DCOM. We
implemented (in VB) a form-server object that supports
Form, Item, Save and Restore methods. The Form and
Item functions return object references to a form or to
any of the controls on that form. Once an object
reference is obtained, Obtuse can set or retrieve the
properties of a form or control. For example, the 
“value” property of a scroll bar is a numeric quantity 
that can be used to get/set the thumb position. 


In the current prototype, a programmer constructs a 
user interface with a VB program called GenForm, 
which is part of the Obtuse system. When GenForm is
executed, it writes a file that contains a text array
constant that “defines” a form. The array constant is
then inserted into an Obtuse program as a “resource”. 
The Restore method causes the VB form-server object 
to display the previously-saved “look”. A VB form is 
encoded/decoded by Save/Restore as a text array. 
Figure 5 illustrates the encoding of the Roving Poller 
form that was displayed in Figure 1. 


[12345, 12, 3, -15, 4, 5535, 1, 2145, 5,
16776960, 2, "Roving Poller", 23456,
14, 3, 72, 4, 144, 0, 89, 1, 41, 5,
-2147483633, 2, "OK", 45678, 12, 3, 0,
4, 88, 0, 125, 5, -2147483633, 2,
"Enter your comments below:",
56789, 12, 3, 16, 4, 0, 0, 361, 1, 49, 6,
""];


Figure 5. Array Constant for a VB Form


To save space when creating a new form, the GenForm 
program only saves the differences between a canonical 
set of control values and those specified by the user. 
For example, the “top” and “left” properties are almost 
always changed; the “visible” property is rarely changed. 
Since VB has a large number of properties for each 
control, this convention saves considerable space. 
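

The difference-only convention can be sketched as follows. The canonical property names and values below are hypothetical, since the paper does not list GenForm's canonical set, and save_control and restore_control are illustrative names rather than GenForm functions.

```python
# Hypothetical canonical property values; the real GenForm set is not given.
CANONICAL = {"top": 0, "left": 0, "width": 100, "height": 30,
             "visible": True, "enabled": True, "caption": ""}


def save_control(properties):
    """Keep only the properties that differ from the canonical values."""
    return {name: value for name, value in properties.items()
            if name not in CANONICAL or CANONICAL[name] != value}


def restore_control(saved):
    """Rebuild the full property set from the canonical values plus the diff."""
    restored = dict(CANONICAL)
    restored.update(saved)
    return restored


button = dict(CANONICAL, top=72, left=144, caption="OK")
saved = save_control(button)        # only top, left, and caption are stored
```

The saving grows with the number of properties that are rarely changed, which matches the observation that VB controls carry many such properties.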


Obtuse has the unusual property that the mechanisms of 
the language implementation are objects, in fact DCOM 
objects. Remember that a DCOM object can be 
activated on any machine. The Obtuse interpreter is 
only required to run Obtuse code, not remote objects. 
This is one of the main differences with Obliq and other 
distributed application systems, which require an 
instance of their interpreter at each node. 


A FILE object supports I/O on files and directories
anywhere in the Internet. Interestingly, DCOM can
pass a file handle from one machine to another and the
handle retains its validity. This is not possible with NT,
the host operating system. Since activating a FILE 
object is necessary to access files and since DCOM 
implements per machine and per object access controls, 
the user has full control over the safety of the system. 


A MUTEX object is used to implement critical section
synchronization for shared variables or resources. The 
supported methods are Enter and Leave. 


The Obtuse object model takes unique advantage of 
DCOM’s capabilities. First, thread and context (a 
program’s global variables) objects can be created on 
any DCOM machine on the Internet so that a thread on 
one machine can opaquely access variables on any other 
machine’s context. Figure 6 lists the attributes of the 
three Obtuse program objects — Thread, Context, and 
Stack.


A context can be shared among any number of threads, 
local or remote. For example, a master debug console 
can easily be constructed to monitor the modification of 
contexts located all over the world. Finally, a thread 
can migrate by simply forking its state to a new machine
and killing the parent thread. Since a context is one of 
the arguments to the Fork method, the new thread can 
be created with its own copy of the parent’s context or 
it can share the parent’s context. All object references 
are marshaled properly by DCOM on inter-machine 
transfers so that programs remain completely location 
opaque. 


STACK objects are always co-located with their thread 
objects; however, they still are DCOM objects. There is
no requirement that execution be “procedure based”.
The interpreter can evaluate formulas with only a stack 
and a code string. When a thread is marshaled to be 
transferred to another machine, the stack content, 
including return addresses, is converted to a portable 
format. 


In the COM model, objects are normally created by 
server front-ends called class factories. The separation 
of request and creation on a per-type basis provides a 
way to create many different types of servers for each 
object type. Remember that in COM an object is 
defined by the interfaces that it supports, not by its data
structures or algorithms. 


4.1 The thread object 


In the current implementation, a THREAD object
contains a code string, an IDispatch object handle to a
context object, two object handles to a stack object, and
type information (used by IDispatch, described later).
The IOperator interface defines methods such as Add
and Subtract; the IProcedure interface defines methods
such as Frame and Return (used for procedure
call/return). The latter interface may be omitted for
calculations that do not involve procedure calls.


OBJECT     ATTRIBUTES         INTERFACES
THREAD     Code string        Com.IUnknown
           Context            Com.IDispatch
           Stack              IThread
           Type Info
CONTEXT    Array Variants     Com.IUnknown
           Type Info          Com.IDispatch
           Persist flag       IObject
STACK      TopOfStack         Com.IUnknown
           ProcFrameIndex     Com.IDispatch
           FrameTopStack      IOperator
           Array of Frames    IProcedure
           Array Variants


Figure 6. Obtuse Program Objects









When a thread starts execution, it binds to an IObject
handle. For efficiency (since a context holds global
variables), the IObject interface was compiled by the
IDL compiler to generate proxy stubs. As a result,
references to a remote context must pass through a
proxy DLL, which must be registered at that site.
Accessing global variables using the IObject interface is
much faster than using IDispatch. 


Figure 7 lists the IThread interface, which contains
methods for creating a thread, marshaling its state, and
controlling its execution. Migrating a thread or object 
depends on support for marshaling its state. In theory, 
any system, such as C++ or Java, could support 
migration. 


4.2 The context object 


A context object is a vector of global variables. Since 
all variables in Obtuse are represented using the variant 
data type, a context is just an array of variants. Further, 
since an array of variants is also a variant type, a context
is marshaled automatically by DCOM when passed from 
one machine to another. 


METHOD    ARGUMENTS           USE
Open      Code string         Start a thread with a default context
OpenEx    Code string         Restart a thread from a saved state
          Initial pc value
          Initial context
          State to restore
          Suspend flag
Fork      Thread object       Clone the current thread and context
          Context object
Join      Timeout value       Wait for a thread to terminate
Suspend                       Stop a thread
Resume
Sleep     Delay value         Timed delay
Context                       Retrieve context handle
Stack                         Retrieve stack handle


Figure 7. The IThread Interface


The IObject interface, which is listed in Figure 8,
describes the methods to store and retrieve Obtuse 
variables. The Get/Put methods access simple variables; 
GetIndex/PutIndex access arrays. Since all Obtuse 
variable locations are the same size, variable addresses 
are just indices (e.g. 0,1,2 etc.). 


Array access is implemented by passing the entire
subscript list as an argument. This approach is more
efficient for remote access than evaluating one subscript
at a time. In Obtuse, array assignment is “by value”. The
only way to generate a “reference” in Obtuse is by
creating a COM object. 
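

A context with index-addressed variables and whole-subscript-list array access might be modeled as below. This is a sketch only: the Python method names mirror Get, Put, GetIndex, and PutIndex, but their signatures are assumptions, and deep copies stand in for the by-value semantics of variant assignment.

```python
import copy


class Context:
    """Toy context: a fixed-size vector of variable slots addressed by index."""

    def __init__(self, size):
        self._slots = [None] * size

    def get(self, index):
        return copy.deepcopy(self._slots[index])    # values are copied, not shared

    def put(self, index, value):
        self._slots[index] = copy.deepcopy(value)

    def get_index(self, index, subscripts):
        """Fetch X[a, b, c] in one call by passing the whole subscript list."""
        element = self._slots[index]
        for s in subscripts:
            element = element[s]
        return copy.deepcopy(element)

    def put_index(self, index, subscripts, value):
        """Store into X[a, b, c] in one call."""
        target = self._slots[index]
        for s in subscripts[:-1]:
            target = target[s]
        target[subscripts[-1]] = copy.deepcopy(value)


ctx = Context(4)
matrix = [[1, 2], [3, 4]]
ctx.put(0, matrix)                    # variable 0 now holds a copy of the array
before = ctx.get_index(0, [1, 0])     # one call resolves both subscripts
ctx.put_index(0, [1, 0], 9)
after = ctx.get_index(0, [1, 0])
```

Passing the full subscript list means a remote array access costs one call regardless of the number of dimensions.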


The context class interface contains a number of helper 
functions (save, restore, persist, sweep) that are 
intended only for local use. The “save” and “restore” 
functions are used to clone a context. The “persist” 
helper function toggles a flag that indicates whether a 
context should be retained after its thread terminates. 
This option is useful for debugging and also for writing
programs that inter-operate by passing contexts back 
and forth. Used in this way, a context is somewhat like 
a COMMON block in FORTRAN. 


METHOD     ARGUMENTS         USE
Get        Index
GetIndex   Index             X[a, b, c]
           Subscript list
PutIndex   Index             X[a, b, c]
           Subscript list


Figure 8. The IObject Interface





The “sweep” helper was introduced after we discovered 
that many of our early programs were leaving objects 
scattered all over the network. The program in Figure 2
illustrates the problem. A remote context is created and
then the local context is “cloned” into it. However, the
new context now has a reference to itself, since its 
handle was in the original context. As a result of this 
circular reference, DCOM never called the destructor 
for the context when the thread terminated. The 
“sweep” method addresses the problem by clearing all 
object handles in a context when its thread terminates. 
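

The circular-reference problem and the sweep remedy can be reproduced with a toy reference-counted context. The class CountedContext and its methods are illustrative assumptions and do not reflect the actual Obtuse or DCOM implementation.

```python
class CountedContext:
    """Toy reference-counted context that may hold handles to contexts."""

    def __init__(self):
        self.ref_count = 1          # the creating thread holds the first reference
        self.slots = []
        self.destroyed = False

    def add_ref(self):
        self.ref_count += 1

    def release(self):
        self.ref_count -= 1
        if self.ref_count == 0:
            self.destroyed = True   # stands in for the destructor call

    def store(self, value):
        """Store a value; storing an object handle counts as a new reference."""
        if isinstance(value, CountedContext):
            value.add_ref()
        self.slots.append(value)

    def sweep(self):
        """Clear every object handle held in this context."""
        for i, value in enumerate(self.slots):
            if isinstance(value, CountedContext):
                self.slots[i] = None
                value.release()


leaked = CountedContext()
leaked.store(leaked)        # the context now holds a handle to itself
leaked.release()            # thread terminates, but the count only drops to 1

swept = CountedContext()
swept.store(swept)
swept.sweep()               # clearing handles first breaks the cycle
swept.release()             # now the count reaches 0
```

Pure reference counting cannot reclaim a cycle on its own, so some explicit step such as this sweep is needed before the final release.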


4.3 The stack object 


As mentioned earlier, there is a one-to-one relationship
between a stack and a thread. By design, a stack can
never contain a reference to itself so circular references
are prevented. Stacks and threads are always co-located
for efficiency. In Obtuse, a stack is a vector of variant
values and may also have an associated vector of frames 
(if procedures are used by the code fragment). 


This design is somewhat unconventional in that most
systems embed the call chain within the evaluation stack.
The disadvantage is that the chain typically uses
pointers, which we avoid by inverting the list. As a
result, the frame stack is a separate array. When a
thread migrates, there are no restrictions on its state. It
can be arbitrarily nested within procedure calls.


Each call frame contains an argument count for the 
procedure, a count of local variables, the evaluation 
stack index for the previous frame, and the index of the 
call point in the thread’s code string. Another 
advantage of inverting the frame stack is that arguments 
and locals are adjacent so indexing is trivial.
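

A pointer-free stack with a separate frame array can be sketched in Python as follows. The field layout is an assumption: each frame here records the argument count, the local count, the evaluation-stack index where the frame's arguments begin, and the code index to resume at. The class and method names are hypothetical, and JSON stands in for the portable marshaling format.

```python
import json


class Stack:
    """Evaluation stack plus a separate array of call frames (no pointers)."""

    def __init__(self):
        self.values = []    # evaluation stack: arguments, locals, temporaries
        self.frames = []    # each frame: [arg_count, local_count, base, return_pc]

    def push(self, value):
        self.values.append(value)

    def frame(self, arg_count, local_count, return_pc):
        """Enter a procedure whose arguments are already on the stack."""
        base = len(self.values) - arg_count
        self.values.extend([None] * local_count)    # locals sit next to arguments
        self.frames.append([arg_count, local_count, base, return_pc])

    def get(self, slot):
        """Read argument or local number `slot` of the current frame."""
        return self.values[self.frames[-1][2] + slot]

    def put(self, slot, value):
        self.values[self.frames[-1][2] + slot] = value

    def ret(self, result):
        """Leave the current procedure and return the saved code index."""
        _, _, base, return_pc = self.frames.pop()
        del self.values[base:]
        self.values.append(result)
        return return_pc

    def save(self):
        """Produce a portable text form of the whole stack state."""
        return json.dumps({"values": self.values, "frames": self.frames})

    @classmethod
    def restore(cls, text):
        state = json.loads(text)
        stack = cls()
        stack.values = state["values"]
        stack.frames = state["frames"]
        return stack


local = Stack()
local.push(10)
local.push(20)
local.frame(arg_count=2, local_count=1, return_pc=42)
local.put(2, local.get(0) + local.get(1))   # local 0 follows the two arguments

remote = Stack.restore(local.save())        # "migrate" while inside the call
resume_at = remote.ret(remote.get(2))
```

Because frames hold only integers, the saved state contains no machine addresses and can be restored on another machine while nested inside a procedure call.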


Figure 9 lists the IOperator and IProcedure interfaces.
The stack class includes two helper functions: Save and
Restore. The “Save” procedure produces an array of
variants that represents the “state” of the stack,
including procedure nesting. The “Restore” procedure
returns a stack to a previous state. Since program state
is a first-class object in Obtuse, it should be possible to 
support fault-tolerant algorithms through various 
checkpointing schemes; however, we have not explored 
this idea further. 


IOperator
Multiply
Divide
Compare
Mod
Invert
           Variant ← constant string
           String variant
Push       Stack ← variant
           Variant ← stack

IProcedure
Frame      Call procedure
Return     Return from procedure
Put        Store local/argument


Figure 9. The IOperator/IProcedure Interfaces


Conference on Object-Oriented Technologies and Systems - June 16-20, 1997


4.4 Type information 


The power of the Obtuse object model derives from
DCOM, particularly when combined with IDispatch. In
this section, we illustrate how a dynamic procedure call
can be accomplished. As is illustrated in Figure 2, the
“object” statement in Obtuse can be used to create an
object on any machine. In COM, an object handle is
created by asking if the object supports a particular
interface. Obtuse always queries for the IDispatch
interface.


Thus, the “me” variable in Figure 2 is a variant record
with a value that is an object handle of type IDispatch.
The code “me.Fork{a, b}” translates to interpreter byte
codes that indicate a method invocation. When the
Obtuse interpreter encounters an “invoke” opcode
(actually properties work the same way), it first has to
bind the method name (Fork) to a method id code.
Automation does not support a call-by-name option. As
a result, it takes two round-trip calls to the server per
method call.


After getting the method id, the interpreter calls the
“Invoke” method in the IDispatch interface of the “me”
object. The arguments to “Invoke” are a vector of
variants (the argument list) and the method id. The
return value from “Invoke” is a variant that encodes the
result of the “Fork” call.


The remaining part of the puzzle is a discussion of how
IDispatch actually constructs a method call to “Fork” in
whatever language “Fork” is implemented, and with the 
appropriate calling convention. Figure 10 illustrates the 
solution used in Obtuse. 


static PARAMDATA rgpdataCBobContextPersist[] =
{
    { "onOff", VT_LONG }
};


static METHODDATA rgmdataCBobContext[] =
{
    // void CBobContext::Persist(long onOff)
    {
        "Persist",
        rgpdataCBobContextPersist,
        IDMEMBER_CBOBCONTEXT_PERSIST,
        IMETH_CBOBCONTEXT_PERSIST,
        CC_STDCALL,
        DIM(rgpdataCBobContextPersist),
        DISPATCH_METHOD,
        VT_VOID
    }
};


static INTERFACEDATA g_idataCBobContext =
{
    rgmdataCBobContext,
    DIM(rgmdataCBobContext)
};


Figure 10. Encoding an IDispatch Interface


The data structure encodes the “type information” 
referred to in Figure 6 for thread and context objects. 
The “interface data” structure contains a count of the 
number of methods, and pointers to method descriptors. 
Each method descriptor contains the name of the 
method, a pointer to a vector of parameter descriptors, a 
method id number, an index into the vTable for the 
class, a count of the number of parameters, a code to 
indicate the calling convention, and the type of the 
return value.


Thus, for the “me.Fork{}” example, the arguments
(encoded as an array of variants) are converted to the
parameter types specified, pushed on the stack in a
calling-convention and language-specific way, then the
vTable index is used to make the call. The return value
is removed from the stack (or registers) and is encoded
as a variant. The Obtuse interpreter then pushes the
return value on its stack and execution continues.


5. Sample Programs 


We have identified three classes of distributed 
applications that can be programmed using Obtuse — 
Migratory, Synchronized, and Cooperative. A 
migratory application moves object state from one 
machine to another. This may involve moving all of a 
program, or only a part, such as the user interface. 
Implementing routing slips for documents is an example 
of a migratory application. A synchronized application 
is one in which the actions at one site are duplicated at 
another. For example, a debug console might be 
synchronized to the program or UI state of a distributed 
application. Finally, a cooperative application requires 
that multiple sites participate to solve problems. 
Decision-making tasks, such as preparing budgets, 
are usually accomplished in a cooperative 
fashion. 


var resource;                                //set from a VB form
var a, b, c, where, me, thread, context;

resource :=
  [12345,12,3,-15,4,5535,1,2145,5,16776960,2,
   "Roving Poller",23456,14,3,72,4,144,0,89,1,41,5,-2147483633,2,
   "OK",45678,12,3,0,4,88,0,125,5,-2147483633,2,
   "Enter your comments below:",56789,12,3,16,4,0,0,361,1,49,6,
   ""];

foreach where in argv do                     //iterate command line arguments
  me := self;                                //a reference to the running thread
  thread := object("Bob.Thread", where);     //create a remote thread
  context := object("Bob.Context", where);   //create a remote context
  if me.Fork{thread, context}=1 then         //duplicate myself at “where”
    quit;                                    //1 returned to parent thread; it quits
  end;                                       //------- child thread starts here
  a := object("Bob.Form", "");               //create a VB form at the child site
  b := a.Restore{resource};                  //method call to VB form server
  c := a.Item{"button0"};                    //get object reference to the button
  loop                                       //delay until a button click
    if c.tag = "1" then exit; end;
  end;
  resource := a.Save{};                      //save the current look and content of form
end;                                         //loop until all the sites have been visited
quit;


Figure 11. A Migratory Application — Roving Poller 


Figure 11 lists the code for the Roving Poller example 
discussed in Section 1. It is an example of a migratory 
application. The program transfers itself to every 
machine in a list, which is specified on the command 
line, so that each user can enter comments in an edit 
control. The completed form, together with the 
accumulated comments, is displayed at the last machine 
in the list. A return-to-sender convention could be 
implemented by placing the name of the originating 
machine last in the argument list. 


The second example, which is listed in Figure 12, is a 
simple, synchronized application that was written as a 
UI performance test. The idea was to synchronize a 
scroll bar on one machine with an arbitrary number of 
scroll bars on a second machine. The objective was to 
increase the number (NBARS) of scroll bars, and then 
to observe the impact on responsiveness. 


The program actually does not scale very well, but the 
problem is with the algorithm, not DCOM. Distributing 
a signal to a large number of recipients should not be 
performed with a simple for loop, but rather with a 
distribution hierarchy. 
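The text does not say what form such a distribution hierarchy would take. The sketch below shows one possibility, assuming a fixed fan-out tree in which each site forwards an update only to its own children; build_tree, hops_to_reach_all, and the site names are hypothetical.

```python
from typing import Dict, List


def build_tree(sites: List[str], fanout: int) -> Dict[str, List[str]]:
    """Arrange sites as a tree rooted at sites[0], at most `fanout` children each."""
    children: Dict[str, List[str]] = {site: [] for site in sites}
    for idx in range(1, len(sites)):
        parent = sites[(idx - 1) // fanout]
        children[parent].append(sites[idx])
    return children


def hops_to_reach_all(children: Dict[str, List[str]], root: str) -> int:
    """Number of forwarding levels needed before every site has the update."""
    hops = 0
    frontier = [root]
    while True:
        frontier = [c for node in frontier for c in children[node]]
        if not frontier:
            return hops
        hops += 1


# One master scroll bar and nine followers, as with NBARS = 9.
sites = ["master"] + [f"bar{i}" for i in range(9)]
tree = build_tree(sites, fanout=3)
max_sends = max(len(kids) for kids in tree.values())
hops = hops_to_reach_all(tree, "master")
```

With a fan-out of three, no site sends more than three updates and every follower is reached in two forwarding levels, whereas the flat for loop makes the master perform all nine sends itself.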


The final example, which is listed in Figures 13 and 14, 
implements a cooperative application that supports a 
common decision-making task performed by three 
co-workers, that is, deciding where to go to lunch in a 
timely, and fair, fashion. When the application is 
initiated, it displays a form containing three edit controls 
at each site. The participants can type in their choice for 
lunch and can observe, but not modify, the other 
choices. Each user can change their mind arbitrarily, but 
at the instant that a majority has agreed on a choice, 
input is frozen and the consensus is displayed for all to 
see. 

var a, b, c, e, old, i, NBARS;
var aa, cc;
var resource;

resource :=
  [12345,14,3,6120,4,6600,0,2880,1,1110,5,8454143,2,
   "Scroll Test",23456,4,5,8454016,89013,8,3,16,4,16,0,153];
NBARS := 9;
aa := [0];                                   //create an array dynamically
for i := 0 to NBARS-2 do                     // should probably be a function to do this
  aa := aa & [0];
end;
cc := aa;
a := object("Bob.Form");                     //create the master scroll bar
for i := 0 to NBARS-1 do
  aa[i] := object("Bob.Form", "a-bobc-1");   //create N remote scrollbars
  b := aa[i];
  c := b.Restore{resource};
  c := b.Form{};                             //get an object reference to the form
  c.Top := (i%8)*1000;                       //place the scroll bars in a grid pattern
  c.Left := (i/8)*4000;
  cc[i] := b.Item{"hscroll0"};               //object reference to each scroll bar
end;
b := a.Restore{resource};
c := a.Item{"hscroll0"};
old := c.value;
loop
  e := c.value;
  if (e > 30000) then exit; end;             //exit when thumb moved to far right
  if e != old then                           //wait for a state change
    for i := 0 to NBARS-1 do                 //update all the other scroll bars
      b := cc[i];
      b.value := e;
    end;
    old := e;
  end;
end;
quit;

Figure 12. A Synchronized Application — Master/Slave Scroll Bars 


6. Performance Measures 


Obtuse implements late binding of procedure calls and 
property access by using the Automation IDispatch 
technology. Obtuse marshals program and user 
interface state by encoding values in variant arrays. 
There is the question of what penalty is paid for the 
additional complexity. 


We conducted performance tests in order to quantify 
some of the costs. The tests were conducted on two 
166 MHz Pentiums, which were on the same 10Mb 
Ethernet segment, and which were running Windows 
NT 4.0. Table 1 lists the results of the tests. Each test 
program was run several times to verify that the results 
were stable and each test loop was repeated 100 to 
10,000 times, depending on the amount of time 
involved. The test programs are listed at the Obtuse 
web site: obtuse.cs.olemiss.edu. 

Figure 13. The Eat-Lunch User Interface 


Obtuse has no program library to provide timing 
functions so the interpreter was modified such that a 
reference to the first global variable in a context 
returned the current time in milliseconds. Thus, two 
references to the same variable could be used as a timing 
function. 


Table 1. Measurements (times in milliseconds) 

Function                          Time (ms)  Software Layers
--------------------------------  ---------  --------------------------
Global variable in a Context                 1: in-process proc call
                                             2: array access

Local Obtuse function call            0.025  1: in-process proc call
                                             2: variant copies

Local COM object function call        0.069  1: in-process call
                                             2: COM runtime
                                             3: IDispatch runtime

Local Visual Basic COM object         1.543  1: cross-process call
function call                                2: COM runtime
                                             3: IDispatch runtime
                                             4: VB runtime

Local VB property reference                  1: cross-process call
                                             2: COM runtime
                                             3: IDispatch runtime
                                             4: VB runtime

Local Clone task (one cycle)          4.400  1: variant copies (lots)
                                             2: memory allocation
                                             3: new OS thread
                                             4: kill OS thread
                                             5: memory free
                                             (all in process)

Remote global variable access         4.600  1: cross-machine call
(in a Context)                               2: COM runtime
                                             3: DCOM runtime
                                             4: net transport

Remote COM object function call      11.020  1: cross-machine call
                                             2: COM runtime
                                             3: DCOM runtime
                                             4: IDispatch runtime

Remote VB property reference         11.820  1: cross-machine call
                                             2: COM runtime
                                             3: DCOM runtime
                                             4: IDispatch runtime
                                             5: net transport (2)
                                             6: VB runtime

Remote Visual Basic COM object       12.000  1: cross-machine call
function call                                2: COM runtime
                                             3: DCOM runtime
                                             4: IDispatch runtime
                                             5: net transport (2)
                                             6: VB runtime

Local Restore and Save of VB Form    43.760  1: cross-process call
                                             2: COM runtime
                                             3: IDispatch runtime
                                             4: VB runtime

Remote Restore and Save of VB        81.100  1: cross-machine call
Form                                         2: COM runtime
                                             3: DCOM runtime
                                             4: IDispatch runtime
                                             5: net transport (2)
                                             6: VB runtime

Remote Clone task and then          266.400  1: variant copies (lots)
remote clone back to the source              2: COM runtime
(one cycle)                                  3: DCOM runtime
                                             4: IDispatch runtime
                                             5: net transport (2)
                                             6: new OS thread
                                             7: kill OS thread
                                             8: memory free


The test results require some explanation. First, there is 
a distinct time difference between variable 
access/function calls and invoking a method on an 
Automation object. The latter operation is slower 
because Obtuse must do a symbol table lookup to bind 
the method name to a method index. Access to Visual 
Basic properties or methods takes even longer, up to 
1.6ms. Since there is no difference in the Obtuse 
runtime code between accessing a VB object and an 
Obtuse object (like Event), we can assume that the VB 
runtime contributes to the factor of 25 performance 
decrease. 


The remote variable access only takes 4ms, versus 11ms 
for the function call, because context object access is 
routed through a proxy DLL on each machine whereas 
access to other Obtuse objects, such as EVENT or 
FILE, must use the IDispatch infrastructure. IDispatch 
requires two separate calls to the remote node for each 
method call: one call to bind the name and one to make 
the call. 


The final set of tests involved cloning a task to a target 
machine and then cloning that task back to the source. 
This provides an indication of the costs to migrate a 
process from one machine to another. The migration 
test was performed without also moving a user 
interface. The local test (in which the target and source 
were the same machine) took 4.4ms and the remote test 
took 266ms. 



7. Observations 


The Obtuse system was designed and implemented by 
the author in eight weeks as part of a summer research 
appointment at Microsoft Corporation. As a result, 
both the language and runtime lack a number of features 
that can be found in more mature languages such as 
C++ or Java. Nevertheless, there are some lessons that 
can be shared based on the experience to date. 


First, migrating a thread or object depends on support 
for marshaling its state. In theory, any system, such as 
C++ or Java, could support migration, and probably 
should. 


The ability to transmit a user interface’s state from one 
machine to another is also a useful capability. Visual 
Basic and its competitors could easily be extended to 
support persistence. The VB property model even lends 
itself to simple encoding strategies. 


We found that support for marshaling the state of forms 
was essential to the implementation of migratory 
applications. Most Office applications, for example, 
marshal (or save) the current user interface state on exit, 
and restore it on startup. Java has provided entry points 
for marshaling UI state with the Applet methods 
init/destroy and start/stop. JDK 1.1 has additional 
support for serialization. Client-side Java has been 
joined by server-side Java. The next evolution should 
be mobile Java. 


The most difficult implementation problems involved 
reference counting, which resulted in objects that never 
got released. For example, in testing the Roving Poller 
example, it was discovered that context objects were 
being activated, but never deleted, on machines all over 
our building. The problem turned out to be a circular 
reference; that is, the context objects had references to 
themselves. The problem was addressed by checking 
for circular references in both the stack and context 
object of a terminated thread. However, this approach 
would not handle indirect recursion through other 
context objects. 
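The limitation noted above can be illustrated with a short sketch. It is not the check used in Obtuse, which looked only for direct self-references. It assumes the references held by each context object can be enumerated as a graph, and follows them transitively so that indirect cycles through other context objects are also found. The function name and the example graphs are hypothetical.

```python
from typing import Dict, List, Set


def reaches_itself(refs: Dict[str, List[str]], start: str) -> bool:
    """Return True if `start` can reach itself by following references."""
    seen: Set[str] = set()
    stack = list(refs.get(start, []))
    while stack:
        node = stack.pop()
        if node == start:
            return True          # found a path back to the starting object
        if node in seen:
            continue             # already explored this object
        seen.add(node)
        stack.extend(refs.get(node, []))
    return False


# A context that refers to itself (the case found with Roving Poller).
direct = {"ctxA": ["ctxA"]}
# Two contexts that refer to each other (the indirect case).
indirect = {"ctxA": ["ctxB"], "ctxB": ["ctxA"]}
# No cycle: plain reference counting releases both objects.
acyclic = {"ctxA": ["ctxB"], "ctxB": []}

results = [reaches_itself(g, "ctxA") for g in (direct, indirect, acyclic)]
```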


DCOM provides an excellent infrastructure for writing 
distributed applications. However, its OLE legacy is the 
source of a number of serious problems. The first, and 
most serious, problem is the OLE registry, which is used 
to store app/server/classId associations and access 
control information. As a result, Windows NT has two 
object information systems—the file system and the 
registry. Even worse, the DCOM configuration 
information can only be manipulated by the system 
administrator. Also, the tools that support changes to 
the registry are neither user friendly nor fault tolerant. 
As a result, for multi-user systems, such as all the 
machines in the CS department, no students can create 
objects. This situation must be resolved before DCOM 
can find wide acceptance. 


A less severe problem is the current specification of 
IDispatch/Variants, which were designed to support 
Visual Basic, and were later modified slightly to support 
Visual Basic for Applications. The design is not 
well-suited for supporting distributed applications. The 
concepts are correct, but the requirements have 
changed. As a result, the design needs to be upgraded. 
For example, the Variant value types are not extensible; 
variants only support what was required to implement 
Visual Basic. There is no reason to require reference 
counting for value-based types, such as complex 
numbers or x-y coordinates. 


8. Related Work 


Obliq depends on Modula-3 for its runtime support; 
Obtuse depends on COM/DCOM. Bharat’s Visual 
Obliq also supports migrating user interfaces. Obtuse is 
unique in its integration of the COM object model in its 
design and in the support for persistent VB objects. 
Hirano’s [6] HORB system is a Java superset that can 
be used to write distributed applications. Gray [7] has 
implemented a transportable agent system, Agent Tcl, 
and maintains an extensive “related work” site [8]. 


9. Acknowledgements 


At Microsoft, Tony Williams conceived of, and initiated, 
the Obliq/DCOM summer project, which was supported 
by Nat Brown and Dennis Adler. 


10. Availability 


The Obtuse system is available for experimentation at 
the web site www.obtuse.cs.olemiss.edu. 


References 


1) Cardelli, L., Obliq: A language with distributed 
   scope. Report No. 122, Digital Equipment 
   Corporation, Systems Research Center, (1994). 

2) Krishna Bharat and Luca Cardelli, Migratory 
   Applications, Proceedings of ACM Symposium on 
   User Interface Software and Technology '95, 
   Pittsburgh, PA, (Nov 1995). 
   http://www.cc.gatech.edu/gvu/people/Phd/Krishna/VO/Migration.html 

3) Distributed COM, Microsoft Corporation (1996). 
   http://www.microsoft.com/windows/common/aa2399.htm 

4) The Component Object Model Specification, 
   Microsoft Corporation (1996). 
   http://www.microsoft.com/oledev/olecom/title.htm 

5) Brown, Nat, and Kindel, Charlie. Distributed 
   Component Object Model Protocol -- DCOM/1.0, 
   (1996). 
   http://ds.internic.net/internet-drafts/draft-brown-dcom-v1-spec-01.txt 

6) Hirano, Satoshi, The HORB System, (1996). 
   http://ring.etl.go.jp/openlab/horb/ 

7) Gray, R.S. et al., Mobile agents for mobile 
   computing. Technical Report PCS-TR96-285, 
   Department of Computer Science, Dartmouth 
   College, 1996. 

8) Related Work, 
   http://www.cs.dartmouth.edu/~agent/ 

Conference on Object-Oriented Technologies and Systems - June 16-20, 1997 




var resource, a, b, c, d, e, f, g, h, i;
resource :=
  [12345,12,3,-15,4,5535,1,2085,5,8454016,2,
   "Where Do We Go For Lunch?",45678,12,2,
   "Enter your choice below:",4,32,3,0,0,117,5,-2147483633,45679,14,2,
   "",4,234,3,16,0,9,1,25,5,-2147483633,45680,12,2,
   "",4,232,3,48,0,3,5,-2147483633,45681,12,2,
   "",4,232,3,80,0,3,5,-2147483633,56789,12,3,16,4,0,0,225,1,25,6,
   "",56790,12,3,48,4,0,0,225,1,25,6,
   "",56791,12,3,80,4,0,0,225,1,25,6,""];
a := [["daddyo", "rob book"], ["monroe", "salt will"],
      ["a-bobc-1", "wispy well"]];
g := [["label1", "text0"], ["label2", "text1"], ["label3", "text2"]];
b := [0,0,0]; i := [0,0,0];
for c := 0 to 2 do
  b[c] := object("Bob.Form", a[c,0]);   //create the form at each machine
  d := b[c];
  e := d.Restore{resource};
  e := d.Item{g[c,0]};                  //get an object reference to each label control
  e.caption := a[c,1];                  //set the “caption” property to the user’s name
  for h := 0 to 2 do                    //lock all the edit fields except the user’s
    f := d.Item{g[h,1]};
    f.locked := h!=c;
  end;
end;
loop
  for c := 0 to 2 do
    d := b[c];
    f := d.Item{g[c,1]};                //retrieve the choices of the other users
    i[c] := f.text;
    for h := 0 to 2 do                  //update my controls to display their latest
      if h!=c then
        d := b[h];
        f := d.Item{g[c,1]};
        f.text := i[c];
      end;
    end;
  end;                                  //as soon as 2-of-3 match, let majority rule
  if (i[0]!="") & (i[0]=i[1]) then i[2] := i[1]; h := 2; exit; end;
  if (i[1]!="") & (i[1]=i[2]) then i[0] := i[1]; h := 0; exit; end;
  if (i[0]!="") & (i[0]=i[2]) then i[1] := i[0]; h := 1; exit; end;
end;
for c := 0 to 2 do                      //lock every control, display consensus everywhere
  d := b[c];
  f := d.Item{g[c,1]};
  f.locked := true;
  f.text := i[c];
  f := d.Item{g[h,1]};
  f.text := i[c];
end;

Figure 14. Where-To-Eat-Lunch Application 

Abstract 


The Eternal system is a CORBA 2.0-compliant system that 
provides, in addition to the location transparency and the 
interoperability inherent in the CORBA standard, support 
for replicated objects and thus fault tolerance. Eternal 
exploits the Internet Inter-ORB Protocol (IIOP) interface 
to ‘‘attach’’ itself transparently to objects operating over 
a commercial CORBA Object Request Broker (ORB). The 
Eternal Interceptor captures the IIOP system calls of the 
objects, and the Eternal Replication Manager maps these 
system calls onto a reliable totally ordered multicast group 
communication system. No modification to the internal 
structure of the ORB is necessary, and fault tolerance 
is provided in a manner that is transparent to both the 
application and the ORB. 


1 Introduction 


Distributed systems consist of clusters of computers that are 
capable of both functioning autonomously and cooperating 
harmoniously to achieve a particular task. The integration 
of an object-oriented paradigm with a distributed computing 
platform yields a framework in which objects are distributed 
across the system. Objects invoke other objects, or are 
themselves invoked, to provide services to the application. 
The Object Management Group (OMG) has established 
the Common Object Request Broker Architecture (CORBA) 
[12, 14, 15, 17, 18], which is a standard for communications 
middleware that defines interfaces to distributed objects and 
that provides mechanisms for communicating operations to 
objects by means of messages. The key component of this 
architecture is the Object Request Broker (ORB), which 
handles requests to, and responses from, the objects in the 
distributed system. 

*Research supported in part by DARPA grant N00174-95-K-0083 and by 
Sun Microsystems and Rockwell International Science Center through the 
State of California MICRO Program grants 96-051 and 96-052. 

Unfortunately, the current CORBA standard makes no 
provision for fault tolerance, which has led to research 
aimed at making CORBA-based applications reliable. One 
approach has been to build the fault-tolerance capabili- 
ties into the ORB itself [16], as in Electra [8, 9] and in 
Orbix+Isis [5]. Another approach, adopted in the Open- 
DREAMS project [3], advocates that reliability be provided 
as part of the suite of object services available to the 
ORB. While the former approach makes the fault tolerance 
transparent to the application, it also involves considerable 
modification to the CORBA implementation to enable the 
ORB to take advantage of a multicast group communica- 
tion system underneath it. On the other hand, the latter 
approach simply adds an object group service on top of an 
unmodified ORB, and uses no underlying multicast group 
communication system, thereby making the system interop- 
erable and portable, but with the fault tolerance visible to 
the application programmer. 

The Eternal system that we are developing provides fault 
tolerance transparently to the application using CORBA, 
without modification to the ORB. The mechanisms for 
achieving reliability are hidden from the application programmer, 
and concern only the system developer. The 
Eternal system can utilize any commercial implementation 
of the CORBA 2.0 standard. Although Eternal is layered 
over a multicast group communication system, the vendor’s 
ORB does not need to be altered to utilize the fault tolerance 
that Eternal provides. Furthermore, the system is designed 
to enable objects running over different ORBs to interact 
with each other. 

The Eternal system exploits the services provided by the 
Totem multicast group communication system [1, 6, 11] to 
maintain the consistency of the replicas that are employed 
for fault tolerance. However, since Eternal only deals with 
interfaces of objects and of the ORB, any multicast group 
communication system, with an interface and guarantees 
similar to those of Totem, can alternatively be used. 

The structure of the Eternal system is shown in Figure 1. 
In this paper, we focus on the Interceptor, which ‘‘catches’’ 
the system calls made by the ORB to TCP/IP, and also on the 
relevant part of the Replication Manager, which diverts the 
calls to Totem. In addition, Eternal supports the evolution of 
a system by exploiting the replication of objects to perform 
live upgrades of objects and their interfaces. Resource 
management is also provided for the creation, placement, 
and distribution of objects. 

Figure 1: Structure of the Eternal system. 


2 CORBA and the IIOP Interface 


The CORBA standard specifies an interface for each dis- 
tributed object. This interface is written in the declarative 
syntax of the OMG Interface Definition Language (IDL). 
The language-specific implementation of a server object is 
hidden from client objects that require the service provided 
by the server object; the server object can be invoked only 
through its interface. 

Invocations of objects and responses from invoked ob- 
jects are handled through the ORB, which acts as the inter- 
mediary or ‘‘communication bus’’ for all of the interactions 
between the distributed objects in the system. At a client 
object, a stub, generated by the IDL compiler, receives the 
request, marshals the call into the format appropriate to the 
request, and passes it to the ORB. At the server object, 
a language-specific mapping of the IDL specification, a 
skeleton, unmarshals the parameters of the call and per- 
forms any additional processing to invoke the appropriate 
method. The results of the operation are returned to the 
client object via the ORB. 
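The request path just described can be sketched in a few lines of Python. This is only an illustration under loud assumptions: JSON stands in for the CORBA wire format, the Orb class is an in-process dictionary rather than a real ORB, and the stub and skeleton are written by hand instead of being generated by an IDL compiler. All class and function names are hypothetical.

```python
import json
from typing import Callable, Dict


class Orb:
    """Toy in-process 'ORB' that routes marshaled requests to skeletons."""

    def __init__(self) -> None:
        self._skeletons: Dict[str, Callable[[bytes], bytes]] = {}

    def register(self, ref: str, skeleton: Callable[[bytes], bytes]) -> None:
        self._skeletons[ref] = skeleton

    def send(self, ref: str, request: bytes) -> bytes:
        return self._skeletons[ref](request)


class Counter:
    """Server object implementation, hidden behind its interface."""

    def __init__(self) -> None:
        self.value = 0

    def add(self, n: int) -> int:
        self.value += n
        return self.value


def make_skeleton(servant: Counter) -> Callable[[bytes], bytes]:
    def skeleton(request: bytes) -> bytes:
        msg = json.loads(request.decode("utf-8"))            # unmarshal the call
        result = getattr(servant, msg["op"])(*msg["args"])   # invoke the method
        return json.dumps({"result": result}).encode("utf-8")
    return skeleton


class CounterStub:
    """Client-side stub: marshals each call and hands it to the ORB."""

    def __init__(self, orb: Orb, ref: str) -> None:
        self._orb = orb
        self._ref = ref

    def add(self, n: int) -> int:
        request = json.dumps({"op": "add", "args": [n]}).encode("utf-8")
        reply = self._orb.send(self._ref, request)
        return json.loads(reply.decode("utf-8"))["result"]


orb = Orb()
orb.register("counter-1", make_skeleton(Counter()))
stub = CounterStub(orb, "counter-1")
first = stub.add(2)
second = stub.add(3)
```

The client sees only the stub's add method and the object reference "counter-1"; where and how the Counter is implemented is known only on the server side.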

CORBA provides location transparency, meaning that 
the client objects convey their requests only to their 
ORBs, which then undertake the task of locating a suit- 
able server object and then dispatching the request to 
it. Thus, a client object need not be aware of the lo- 
cation of a server object since the ORB has access to 
this information. Every CORBA object is identified by 
an object reference, which is assigned to it by the ORB 
at the time the object is created. Client objects asso- 
ciate object references with their requests to enable the 
ORB to route their requests to the appropriate destina- 
tions. 

The interoperability of CORBA arises in the context 
of communication between heterogeneous ORBs. Every 
CORBA 2.0-compliant ORB is equipped with the ability to 
communicate using the Internet Inter-ORB Protocol (IIOP) 
[10, 12], which ensures that objects running over different 
ORBs can interwork when they use the IIOP interface. Only 
the ORB hosting an object needs to know the details of the 
object, while other ORBs that wish to interact with the object 
need only be able to address it. Every object is assigned an 
Interoperable Object Reference (IOR) for this purpose. 

The General Inter-ORB Protocol (GIOP) is a general set 
of specifications that enable the messages of the ORB to be 
mapped onto any connection-oriented medium that meets a 
minimal set of assumptions (reliable, byte stream-oriented, 
loss-of-connection notification). The Internet Inter-ORB 
Protocol (IIOP) is GIOP with the messages transported by 
TCP/IP. By sending IIOP messages over TCP/IP, the ORBs 
can use the Internet as the backbone for their communication. 
Server objects that use IIOP to interact with their 
client objects in an environment of heterogeneous ORBs 
publish their references in the form of IIOP IOR profiles. 
The primary motivation for the use of IIOP is that 
all CORBA 2.0-compliant implementations can use this 
simple generic interface, irrespective of the internal details 
of the vendor’s ORB, and the platform on which the ORB 
operates. A number of commercial ORBs now provide 
IIOP as their native protocol, since an increasing number of 
CORBA applications require interoperability over different 
platforms and the ability to operate over the Internet. 


3 The Eternal System 


The Eternal system is designed to work with any com- 
mercial off-the-shelf CORBA 2.0-compliant ORB with no 
modification whatsoever to the ORB. Moreover, the fault 
tolerance is provided in a manner that is transparent to the 
application objects. 

Since the underlying fault tolerance capabilities are hid- 
den from the application, the application programmer does 
not need to worry about the difficult issues of asynchrony, 
replica consistency, concurrency, and the handling of faults. 
The Eternal system replicates and distributes the application 
objects across the system, and allows the programmers to 
write the application as if it were a sequential program to be 
run on a single machine. 

Fault tolerance is provided by replication [7] of both 
client and server objects across the distributed system. As 
shown in Figure 1, Eternal exploits the reliable totally 
ordered message delivery of the underlying Totem system 
to ensure replica consistency in the presence of faults. In 
addition, mechanisms are provided to detect and suppress 
duplicate operations and to support nested operations [13]. 
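The paper defers the duplicate-suppression mechanism to [13]. The sketch below shows only the general idea, under the assumption that every replica of an invoking object tags the same logical operation with the same (group, operation number) identifier. That identifier scheme and the class name are hypothetical, not taken from Eternal.

```python
from typing import Callable, Dict, List, Tuple


class DuplicateSuppressor:
    """Execute each logical operation once; answer repeats from a cache."""

    def __init__(self, execute: Callable[[str], str]) -> None:
        self._execute = execute
        self._results: Dict[Tuple[str, int], str] = {}

    def deliver(self, invoker_group: str, op_id: int, request: str) -> str:
        key = (invoker_group, op_id)
        if key not in self._results:
            # First copy of this operation: perform it and remember the reply.
            self._results[key] = self._execute(request)
        return self._results[key]


calls: List[str] = []


def execute(request: str) -> str:
    calls.append(request)
    return request.upper()


server = DuplicateSuppressor(execute)
# Three client replicas issue the same logical operation.
replies = [server.deliver("clients", 1, "deposit") for _ in range(3)]
```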


4 Group Communication Models 
4.1 Process Groups 


An increasing number of distributed applications are struc- 
tured as collections of processes that interact or cooperate to 
accomplish a particular task. Such a collection of processes 
is called a process group and can be considered abstractly 
as a single unit, as shown in Figure 2. A process group 
may reside entirely within a single processor, or may span 
several processors. 

A process group is characterized by its membership, and 
processes can be added and removed from the process group 
by the execution of a group membership protocol. A process 
is permitted to be a member of more than one process group, 
thereby resulting in intersecting process groups. 

The services of a process group can be invoked transparently, 
with no knowledge of its exact membership or the 
location of its member processes. Thus, a process in the 
system can address all of the members of a process group 
(including its own) as a whole, using a multicast group 
communication system, such as Totem. A process can send 
messages to one or more process groups, of which it may or 
may not be a member. These messages are totally ordered 
within and across all receiving process groups. 

The Totem system provides reliable totally ordered mul- 
ticasting of messages to processes in process groups. Each 
message is assigned a unique timestamp, and these times- 
tamps establish the total order of delivery of messages to the 
application. For messages multicast and delivered within 
the same configuration of processors, Totem provides these 
message delivery guarantees despite communication and 
processor faults, message loss, and network partitioning. 
The process group layer takes advantage of these services 
and guarantees of the underlying Totem protocols to provide 
reliable totally ordered multicasts within and across process 
groups. 
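As an illustration of how timestamps yield a single delivery order, the sketch below holds back messages that arrive early and delivers them strictly in timestamp order. It assumes consecutive integer timestamps starting at 1, which is a simplification made for this example. It does not model Totem's actual protocol, its fault handling, or its membership changes.

```python
import heapq
from typing import List, Tuple


class OrderedDelivery:
    """Deliver messages in timestamp order, holding back early arrivals."""

    def __init__(self) -> None:
        self._next = 1                           # next timestamp to deliver
        self._held: List[Tuple[int, str]] = []   # min-heap of waiting messages
        self.delivered: List[str] = []

    def receive(self, timestamp: int, payload: str) -> None:
        if timestamp < self._next:
            return                               # already delivered; drop the copy
        heapq.heappush(self._held, (timestamp, payload))
        while self._held and self._held[0][0] <= self._next:
            ts, msg = heapq.heappop(self._held)
            if ts == self._next:                 # skip stale duplicates in the heap
                self.delivered.append(msg)
                self._next += 1


# Two group members see the same messages in different arrival orders.
r1, r2 = OrderedDelivery(), OrderedDelivery()
for ts, msg in [(2, "b"), (1, "a"), (3, "c")]:
    r1.receive(ts, msg)
for ts, msg in [(3, "c"), (1, "a"), (2, "b"), (2, "b")]:
    r2.receive(ts, msg)
```

Both members end with the same delivered sequence, which is the property the process group layer relies on.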


4.2 Object Groups 


Analogous to the notion of a process group, an object group 
is a collection of objects that cooperate to provide some 
useful service, as shown in Figure 3. This abstraction 
enables a client object to invoke the services of a server 
object group transparently, as if it were a single object. The 
server object group can also return the results to a client 
object group transparently, as if it were a single object. 

An object group may consist of similar or dissimilar 
objects. In the Eternal system, a replicated object is rep- 
resented by an object group, the members of which are 
identical and are the replicas of the object. Both client and 
server objects can be replicated and thus can be represented 
as object groups. The reliable totally ordered multicasts of 
Totem are used to communicate the invocations to, and the 
responses from, the object group. The replicas of an object 
receive the same operations in the same order, thereby en- 
suring consistency of the states of the object replicas. The 
exact location of the replicas of the object, the degree of 
replication, and the type of replication (active or passive) 
is transparent to an object that invokes the services of a 
replicated object. 


5 The Eternal Interceptor 


The Eternal Interceptor is a user-level layer between the 
ORB and the operating system. The principle underlying 
the design of the Interceptor is that the functionality of an 
operating system can be extended at the user level, without 
requiring modifications to the kernel or to the standard 
system libraries. One way of doing this is by intercepting 
system calls from specified processes before these calls 
reach the kernel, and then modifying these system calls to 
implement the desired functionality. The mechanisms are 
entirely transparent to an application process whose system 
calls are intercepted. 

Figure 2: Process groups in Totem. 

Such an approach is useful for the development of global 
file systems [2] and for the testing of kernel extensions. 
The Eternal system employs the same approach to ‘‘attach’’ 
itself transparently, via the Eternal Interceptor, to all objects 
that operate over a CORBA 2.0-compliant ORB. 


5.1 Intercepting System Calls 


The Eternal system can ‘‘attach’’ itself to any CORBA 
object and can ‘‘catch’’ a specified set of system calls 
that are made by the ORB during the object’s interactions 
with the system. To do this, the Interceptor, given the 
process identifier pid assigned by Unix to the object being 
intercepted, locates and performs a continual trace on the 
file /proc/pid, which is part of the /proc file system of the 
Unix system. 

The system calls to be captured at either the entry to, or 
the exit from, the system call can be specified a priori. In 
the normal course of events, these system calls would reach 
the kernel and be executed. However, in Eternal, the tracing 
facilities provided under the /proc interface are exploited 
to enable the specified system calls to be intercepted before 
they reach the kernel. The arguments, and possibly the return 
values, of these system calls can be extracted and examined, 
and the system calls can themselves be modified before 
they are forwarded to the operating system. Furthermore, 
all of these mechanisms can be implemented without the 
intercepted object being aware of their existence. 

Figure 3: Object groups in Eternal. 


The obvious advantage of such an approach is that the 
operation of the Interceptor is transparent to both the CORBA 
objects and the ORB itself. This functionality can be im- 
plemented entirely at the user level, with no modification to 
the operating system. The application objects and the ORB 
need not be recompiled to take advantage of the intercepting 
capability. Once the Interceptor is started, it waits to receive 
a message from any newly created CORBA object. As a 
part of its initialization phase, every object supplies its Unix 
process identifier pid to the Eternal system. The Interceptor 
then monitors /proc/pid for the entire lifetime of the object. 

A typical CORBA object invokes many system calls 
during its lifetime. These include calls for memory allo- 
cation, runtime library access, file operations and network 
operations. While some of these calls may be local to the 
machine, any system call that constitutes communication 
with another object, whether local or remote, must take 
place using the ORB. The system calls of interest are those 
that are used by the objects to communicate over IIOP. 

All CORBA objects in Eternal use the IIOP interface. The 
IIOP interface is a simple generic interface to TCP/IP, which 
makes capture of its calls easy. Since we are only interested
in system calls that are IIOP-specific (communication-specific)
and not object-specific, the system calls that need
to be intercepted are the same for all of the objects operat- 
ing over the CORBA ORB. Since interception of the calls 
is transparent to the ORB, any off-the-shelf commercial 
CORBA ORB, that is capable of communicating over IIOP, 
can be used unmodified. 

Once the system calls of IIOP are intercepted by Eternal, 
the relevant arguments are extracted from the system calls 
and passed to the process group layer for communication 
over Totem. However, the ORB is unaware that its messages 
are delivered by Totem, since it ‘‘believes’’ that it is using
only the IIOP interface, the calls of which were originally 
intended for TCP/IP. 
Figure 4: Mapping the IIOP interface onto the process group interface.


5.2 IIOP-Specific System Calls


Each time a server object publishes its identity or each time 
an object interacts with any other object in the system, the 
IIOP interface is used. If the ORB’s native protocol is IIOP 
itself, this use is unnecessary.


5.2.1 open() System Call 


Since the IIOP interface uses TCP/IP, the open() system call 
to TCP/IP is among those intercepted. The file descriptor 
returned from this call is recorded so that it can be monitored 
for activity by the system. There are two cases in which an 
open() call may be invoked. In the first case, a connection 
over TCP/IP is established by a server object that publishes 
its Interoperable Object Reference (IOR) across the network
and ‘‘listens’’ for any client object that requires its services.
The second case occurs when a client object requests service
from a server object and the two objects establish separate 
connections to TCP/IP in order to communicate. 


Thus, each server object has a principal TCP/IP con- 
nection, on which it ‘‘listens’’ for clients, and establishes 
additional TCP/IP connections when the client objects de- 
sire to communicate with the server object. The additional 
TCP/IP connections are typically open for the lifetime of 
the client objects, while the principal ‘‘listening’’ TCP/IP 
connection is open for the lifetime of the server object. 


The first open() call to TCP/IP, in turn, triggers the
Replication Manager, via the Interceptor, to establish a
connection with the process group interface of a reliable
group communication system, such as Totem, in anticipation
of any communication that might follow. Thus, a given
object, via the file descriptor associated with this first
open() call, is associated with a particular connection to the
Totem system interface. Subsequent open() calls to TCP/IP,
which represent client-server communication, are recorded
by means of their file descriptors, which are then monitored
for any activity. All client-server interactions on these
file descriptors can be channelled through the respective
connections of the client and the server to Totem.


System Calls of the IIOP Interface      Routines of the Process Group Interface
open(fd)                                Process_Connect(pgid), Process_Join(pgid)
close(fd)                               Process_Leave(pgid)
read(fd, <buffer to read data into>)    Process_Receive(<buffer to receive data>)
write(fd, <data>)                       Process_Send(pgid, <data>)
poll(<list of fds>)                     Process_Poll(<socket to the process group controller>)

Figure 5: Correspondence between the IIOP system calls and the process group layer routines. Here, fd refers to the file
descriptor returned from opening /dev/tcp, and pgid refers to the process group identifier. Only the arguments that are
relevant to the mapping are shown.


5.2.2 poll() System Call


A server object, on establishing its principal ‘‘listening’’ 
connection, polls its associated TCP/IP file descriptor, and 
blocks till it hears from a client object that requires its 
services. A poll() call may also be executed in the middle 
of a series of client-server interactions, when either object 
is waiting in anticipation of communication from the object 
at the other end of the TCP/IP connection. It is also possible 
for an object to poll several file descriptors simultaneously 
for activity. 


5.2.3 read() and write() System Calls


Typical communication between objects in the Eternal sys- 
tem consists of a sequence of read() and write() system 
calls that operate over IIOP. For each object, these system 
calls are associated with a file descriptor on which they 
are invoked. The Interceptor records and monitors all of 
the active file descriptors associated with an object, and 
the Replication Manager maps these file descriptors onto 
the underlying multicast group communication system, in 
our case Totem. Thus, any system call that uses one of 
these file descriptors can be mapped to the Totem system 
interface. 

The read() and write() system calls are associated with 
receive and send buffers, respectively, that store the infor- 
mation that is received or is to be sent. The contents of these 
buffers represent the user-level abstractions of the messages
that are communicated between client and server objects. 


The read() and write() system calls used by IIOP contain, 
in the first few bytes, the GIOP header. The IIOP read()s and 
write()s are distinguished from other read()s and write()s by 
the first four bytes of the data, which represent the magic 
field of the GIOP header, as shown in Figure 6. This field, 
along with the list of file descriptors associated with the 
TCP/IP connections, helps in discarding any read()s and 
write()s that might not require the IIOP interface and, thus,
are not of interest. 
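
The filtering step described above can be sketched in Python. This is a minimal illustration, not Eternal's code: it assumes the 12-byte GIOP 1.0 header layout of Figure 6 (4-byte magic, 2-byte version, 1-byte byte-order flag, 1-byte message type, 4-byte message size), and the names parse_giop_header, is_iiop_call and tracked_fds are hypothetical.

```python
import struct

GIOP_MAGIC = b"GIOP"
GIOP_HEADER_SIZE = 12  # magic(4) + version(2) + byte_order(1) + type(1) + size(4)


def parse_giop_header(data):
    """Return the header fields if data starts with a GIOP header, else None."""
    if len(data) < GIOP_HEADER_SIZE or data[:4] != GIOP_MAGIC:
        return None  # not IIOP traffic, so the call is not of interest
    major, minor, byte_order, message_type = struct.unpack_from("4B", data, 4)
    # In GIOP 1.0 the byte_order flag is 0 for big-endian, 1 for little-endian.
    endian = "<" if byte_order else ">"
    (message_size,) = struct.unpack_from(endian + "I", data, 8)
    return {
        "version": (major, minor),
        "little_endian": bool(byte_order),
        "message_type": message_type,
        "message_size": message_size,
    }


def is_iiop_call(fd, data, tracked_fds):
    """A read()/write() matters only on a tracked fd whose data carries the magic."""
    return fd in tracked_fds and parse_giop_header(data) is not None
```

Both tests are needed: the descriptor list rules out traffic on unrelated files, and the magic field rules out non-GIOP data on a TCP/IP descriptor.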


5.2.4 close() System Call


Since the open() system call is intercepted, the close() 
system call for each associated file descriptor must also be 
intercepted. This call is typically invoked when a client 
object wishes to close a TCP/IP connection once it has 
completed communication with a server object. It can 
also be used to ‘‘tear down’’ the principal connection of a 
server object, thereby removing it from the CORBA object 
space. 


The close() system call, like the open() system call, must 
be handled at both server and client objects. At both client 
and server objects, Eternal deletes any reference to the file 
descriptor associated with the connection (that is now being 
closed). Thus, if the object reuses the same file descriptor 
for future connections, a new association will be registered. 


struct MessageHeader {
    char magic[4];
    Version GIOP_version;
    boolean byte_order;
    octet message_type;
    unsigned long message_size;
};





Figure 6: Structure of the header of a GIOP message. 
5.3 The Process Group Interface


The intercepted read(), write(), and poll() system calls are 
also mapped onto their corresponding calls in the Totem 
system interface. It is crucial that the underlying multicast 
group communication system, in our case Totem, possesses 
an interface to which the Interceptor and the Replication 
Manager can map these intercepted system calls. 

In order that the services be provided to the application 
transparently, the group communication system must pro- 
vide a simple interface to enable the objects that constitute 
the application to invoke its services. The interface that 
Totem provides to an application above it is designed to 
hide the implementation details of the underlying protocols 
by presenting only a small number of essential primitives 
that the application needs to use. The interface is intended 
to be simple and elegant and yet to allow the application to 
exploit fully the process group mechanisms of Totem. 

On each processor, a process group controller manages 
all of the process groups on that machine. For each process 
group on that processor, the process group controller main- 
tains information about the member processes (both local 
and remote) and provides membership services for joining 
the group, leaving the group and updating the membership. 
It also maintains a list of the process groups hosted by the 
machine. 

To establish a connection with the process group con- 
troller of Totem, the application calls the Process_Connect() 
routine, supplying the identifier of the process group to 
which the application process wishes to connect. If the pro- 
cess group does not exist, the process connects to a process 
group of which it is the only member. The routine returns 
the identifier of the communication socket that connects the 
process to the process group controller. 

To join a process group with which it has established a 
connection, a process calls the Process_Join() routine, sup- 
plying the identifier of the process group that it wishes to 
join, as well as the identifier of the socket between the pro- 
cess and the process group controller. The Process_Leave() 
routine, which takes the same arguments, initiates the re- 
moval of a process from the specified process group. 

A process can send messages to a process group using 
the Process_Send() routine with the receiving process group
identifier and the message to be sent as arguments. A 
process can receive messages from another process using 
the Process_Receive() routine with the receiving buffer as 
an argument. The received message is disassembled and 
the identifier of the sending process is extracted from the 
message header, along with information about the process 
groups to which the message is addressed. 

The socket between the application process and the
process group controller can be polled for any messages
pending delivery, using the Process_Poll() routine. Routines
are also supplied to close the communication socket, once
the process disconnects from the process group controller.


while Interceptor is running do
    listen for any newly created CORBA objects
    for each CORBA object created do
        obtain process identifier pid and interface name of the object
        obtain object group identifier ogid from the Replication Manager
        specify the system calls (those used by IIOP) to intercept
        while the object is operational do
            wait to intercept the specified system calls when they occur
            case <system call intercepted>
                open():
                    if first open() on /dev/tcp then
                        record this as the primary file descriptor for this ogid
                        invoke Replication Manager to handle this system call
                    endif
                    if subsequent open() on /dev/tcp then
                        add file descriptor to the list of descriptors for this ogid
                        obtain ogid of the object at the other end of the connection
                    endif
                close():
                    if server and close() on the primary file descriptor then
                        invoke Replication Manager to handle this system call
                    endif
                    if client and close() on the last open file descriptor then
                        invoke Replication Manager to handle this system call
                    endif
                poll():
                    if poll() on the previously recorded file descriptor then
                        invoke Replication Manager to handle this system call
                    endif
                read():
                    if read() on the previously recorded file descriptor then
                        invoke Replication Manager to handle this system call
                    endif
                write():
                    if write() on the previously recorded file descriptor then
                        invoke Replication Manager to handle this system call
                    endif
            endcase
            resume the operation of the object
        endwhile
    endfor
endwhile


Figure 7: Algorithm executed by the Interceptor.


The calls on the IIOP interface are mapped, through the 
Interceptor and the Replication Manager, to the process 
group interface of the Totem system. The implementation 
of Eternal makes it possible to use any multicast group 
communication system, as long as it provides the same fault 
tolerance guarantees and a similar process group interface 
as Totem. The set of routines that the Totem process group 
interface uses facilitates the mapping of the IIOP calls onto 
Totem. The routines can be provided as a library that is 
used by the Interceptor and the Replication Manager. 
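
To make the shape of such a library concrete, the following Python sketch models the routines of this section with a toy, single-machine controller. It is not Totem: there is no network, ordering or fault tolerance, and the class name, the method names and the explicit sender argument of process_send are assumptions made for the illustration.

```python
from collections import deque


class ProcessGroupController:
    """Toy stand-in for a process group controller on one machine."""

    def __init__(self):
        self.members = {}      # pgid -> set of sockets that have joined
        self.inboxes = {}      # socket -> queue of (sender, pgid, data)
        self._next_socket = 0

    def process_connect(self, pgid):
        """Create the group if needed and return a new socket identifier."""
        self.members.setdefault(pgid, set())
        sock = self._next_socket
        self._next_socket += 1
        self.inboxes[sock] = deque()
        return sock

    def process_join(self, pgid, sock):
        self.members[pgid].add(sock)

    def process_leave(self, pgid, sock):
        self.members[pgid].discard(sock)

    def process_send(self, sender, pgid, data):
        """Deliver data to every current member of the group."""
        for sock in sorted(self.members.get(pgid, ())):
            self.inboxes[sock].append((sender, pgid, data))

    def process_poll(self, sock):
        """Report whether any message is pending delivery on the socket."""
        return len(self.inboxes[sock]) > 0

    def process_receive(self, sock):
        """Return the oldest pending (sender, pgid, data) triple."""
        return self.inboxes[sock].popleft()
```

A sender that belongs to the destination group receives its own message as well, which matches the ‘‘loopback’’ behaviour used later for duplicate detection.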
obtain interface name of object from the Interceptor
look up the table of mappings of interfaces to object groups
if interface name present in the table then
    extract the object’s object group identifier ogid
else
    assign a unique object group identifier ogid
    record the interface name and its ogid in the table
endif
communicate the ogid for this interface name to the Interceptor


Figure 8: Algorithm executed by the Replication Manager
to assign a unique process (object) group identifier for each
object.


5.4 Mapping IIOP to Totem 


The system calls of the CORBA objects communicating 
over the IIOP interface are analogous to the routines that the 
process group interface presents to an application process. 
In this context, an application process corresponds to a 
CORBA object in the system, and the process group identi- 
fiers of Totem correspond to the object group identifiers of 
Eternal. 

The open() system call to TCP/IP corresponds to the 
Process_Connect() routine of the process group interface,
since both the call and the routine are involved with the 
establishment of connections. The assignment of the object 
(process) group identifier is handled by the Replication 
Manager, as discussed in Section 6. 

The close() system call on an open file descriptor corresponds
to the Process_Leave() routine only if the file
descriptor involved is the principal one, since in this case
both the call and the routine correspond to the ‘‘tearing
down’’ of established connections. If the close() system call 
is invoked on any file descriptor other than the principal one 
for a server object or the last open file descriptor for a client 
object, the close() call simply causes the Replication Man- 
ager to remove any association of the file descriptor with 
the object group identifier. The Process_Leave() routine 
effectively disconnects the process from the process group 
controller and is, thus, invoked only when the object is to 
be destroyed or removed from the object space. 

The read(), write(), and poll() system calls find their 
counterparts in the Process_Receive(), Process_Send(), and
Process_Poll() routines of the process group interface. The 
send and receive buffers of the application objects contain 
the information that is sent or received over Totem via 
the process group interface. However, the captured read() 
and write() system calls cannot be mapped directly onto 
the routines of the Totem process group layer since the 
Interceptor must first associate the file descriptors in the 
system calls with the process group identifiers in the process 
group interface routines. 
There may be several CORBA objects to which the
Interceptor ‘‘attaches’’ itself. Each such object may service
multiple requests at the same time, which means multiple 
connections must be managed for each object. However, 
for the purposes of replication, an object is associated with 
only one object (process) group, all the members of which 
are identical. Thus, each replica of a replicated object is a 
member of an object (process) group with a unique object 
(process) group identifier. 

The functionality of the Interceptor is implemented using 
the algorithm shown in Figure 7. The Interceptor does not 
handle all aspects of the object group mechanisms; it utilizes 
the services of the Replication Manager for this purpose. 
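
The descriptor bookkeeping in Figure 7 can be sketched as follows for the server-side case. The class DescriptorTable and the action strings it returns are hypothetical; a real Interceptor would invoke the Replication Manager at the points where this sketch merely returns a label.

```python
class DescriptorTable:
    """Tracks the TCP/IP file descriptors of one intercepted server object."""

    def __init__(self, ogid):
        self.ogid = ogid
        self.primary_fd = None  # the "listening" connection
        self.peers = {}         # fd -> ogid of the object at the other end

    def on_open(self, fd, peer_ogid=None):
        """First open() becomes the primary descriptor; later ones are recorded."""
        if self.primary_fd is None:
            self.primary_fd = fd
            return "connect_and_join"
        self.peers[fd] = peer_ogid
        return "record"

    def on_close(self, fd):
        """Closing the primary descriptor leaves the group; others are forgotten."""
        if fd == self.primary_fd:
            self.primary_fd = None
            return "leave"
        self.peers.pop(fd, None)
        return "forget"

    def is_tracked(self, fd):
        """Only calls on tracked descriptors are passed to the Replication Manager."""
        return fd == self.primary_fd or fd in self.peers
```

Because on_close() removes the entry, a descriptor number that the operating system later reuses is registered afresh by the next on_open(), as Section 5.2.4 requires.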


6 The Eternal Replication Manager 
6.1 Assignment of Object Group Identifiers 


The object groups that are used for replication are handled 
by the Replication Manager. At creation time, when an 
object informs the Interceptor of its Unix process identifier, 
it also conveys the name of its interface. The Interceptor 
hands this information over to the Replication Manager,
which associates a unique object (process) group identifier
with each interface name. Since the interface name, rather
than any ORB-specific name, is used for this association, 
objects implementing the same interface, but operating over 
different ORBs, can be members of the same object group 
and are treated as replicas from this viewpoint. 

The Replication Manager maintains a globally accessi- 
ble table of the mapping between object (process) group 
identifiers and interface names. Each time an object is 
created, if it is the first replica of the object in the system, 
an entry is created in this table for the object’s interface 
and a unique object (process) group identifier is assigned 
to it. When further replicas of the object are created and 
distributed across the system, this table is referenced to 
ensure that all of the replicas of the object are assigned to 
the same object group. The object group identifier is as- 
signed or discovered by the Replication Manager, on behalf 
of the object, using the algorithm shown in Figure 8. In 
complete implementations of CORBA, the Interface Repos- 
itory, which stores the interface definitions, can be used 
to register the object group identifier associated with each 
interface name. 
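
The assignment algorithm of Figure 8 amounts to a lookup-or-create on a table keyed by interface name. A minimal Python sketch, assuming integer identifiers drawn from a counter (the class GroupIdTable and the example interface names are illustrative only):

```python
import itertools


class GroupIdTable:
    """Maps interface names to object group identifiers (ogids)."""

    def __init__(self):
        self._table = {}
        self._counter = itertools.count(1)  # assumed source of unique ogids

    def ogid_for(self, interface_name):
        """Return the existing ogid for the interface, or assign a new one."""
        if interface_name not in self._table:
            self._table[interface_name] = next(self._counter)
        return self._table[interface_name]
```

Every replica that reports the same interface name receives the same identifier, regardless of which ORB it runs over.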

When a server replica opens its ‘‘listening’’ connection,
it discovers its object group identifier and joins its object
group. When a client replica wishes to establish a connection
to this server object, it must first discover its own object
group identifier and join its object group. The client replica
then must discover the object group identifier of the object
at the other end of the connection (the server object, in this
case), and record the association between the connection
file descriptor and the server object group identifier.


while Replication Manager is running do
    obtain an intercepted system call of the object with the arguments
    case <system call intercepted>
        open():
            execute Process_Connect() using the object’s pgid
            execute Process_Join() using the object’s pgid
        close():
            execute Process_Leave() using the object’s pgid
        poll():
            execute Process_Poll() using the object’s pgid
        read():
            extract the data part of the system call
            execute Process_Receive()
        write():
            extract the data part of the system call
            obtain the receiver’s pgid using the file descriptor in the call
            execute Process_Send() using the receiver’s pgid
    endcase
endwhile


Figure 9: Algorithm executed by the Replication Manager
to communicate messages over the process group layer.

Once this association is registered, the read(), write(), and
poll() calls made by the client object on the file descriptor 
can be intercepted and subsequently mapped appropriately 
by the Replication Manager to the known server object 
(process) group identifier, as shown in Figure 9. 
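
The translation step of Figure 9 can be written as a small dispatch function. The sketch below only names the routine that would be invoked and the group identifier it would use; map_call and the fd_to_peer_pgid dictionary are hypothetical, and the principal-descriptor check for close() described in Section 5.4 is assumed to have been made by the caller.

```python
def map_call(call, fd, own_pgid, fd_to_peer_pgid):
    """Return the process group routines for one intercepted system call."""
    if call == "open":
        return [("Process_Connect", own_pgid), ("Process_Join", own_pgid)]
    if call == "close":
        return [("Process_Leave", own_pgid)]
    if call == "poll":
        return [("Process_Poll", own_pgid)]
    if call == "read":
        return [("Process_Receive", None)]
    if call == "write":
        # The destination group is found through the file descriptor.
        return [("Process_Send", fd_to_peer_pgid[fd])]
    raise ValueError("unexpected system call: " + call)
```

Only write() needs the descriptor-to-group association, which is why that association must be registered before any data is exchanged.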

At the server object, the Replication Manager extracts 
the client’s object group identifier from the information 
packed by Totem into the client object requests that arrive. 
The Replication Manager then associates this information 
with the file descriptor of the TCP/IP connection established 
by the server object to communicate with the client object. 
Thus, intercepted system calls on the file descriptor at the 
server object can also be similarly mapped to the appropriate 
client object group identifier. 


6.2 Detection of Duplicate Operations 


In the course of their interactions with other objects, the 
replicas of an object may give rise to duplicate invocations 
and responses. These must be suppressed at the sender or 
the receiver since duplicate operations on an object can po- 
tentially corrupt its state. The Eternal system accomplishes 
the detection and suppression of duplicate operations by 
means of operation identifiers [13], which are assigned by 
the Replication Manager. 

When a replicated object transmits requests or responses, 
the Replication Manager ensures that the object’s own object 
group identifier is included in the list of object groups that 
are to receive the request message. This does not imply, 
however, that the replicas in the object’s own object group
consider the incoming request message as an operation to 
be performed. The ‘‘loopback’’ mechanism serves only to 
notify the object’s own object group of the transmission of 
the request. 

Thus, every replica of the object that receives messages 
containing the invocations or responses of another replica in 
the same object group can suppress its own invocations or 
responses. The Replication Manager detects these duplicate 
operations by extracting the operation identifier from the 
messages that it receives from the process group layer, 
and then comparing the identifier with those it has already 
received. If the Replication Manager has already received a 
message containing this invocation or response, it discards 
the message, thereby preventing it from reaching the object 
and corrupting its state. 
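
The suppression logic reduces to remembering which operation identifiers have been seen. A minimal Python sketch, assuming identifiers are hashable values and ignoring the question of when old identifiers may be forgotten (the class DuplicateFilter is illustrative, not part of Eternal):

```python
class DuplicateFilter:
    """Passes each operation to the object at most once."""

    def __init__(self):
        self._seen = set()

    def deliver(self, operation_id, message):
        """Return message the first time operation_id is seen, else None."""
        if operation_id in self._seen:
            return None  # duplicate from another replica: discard
        self._seen.add(operation_id)
        return message
```

With three replicas issuing the same invocation, the receiver's filter passes the first copy and discards the other two.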


6.3 Replication Schemes 


Eternal is equipped to handle both active and passive repli- 
cation in a manner that is transparent to the ORB, as well as 
to each replicated object. Active replication, in which each 
operation is performed by every replica of the object, re- 
quires the detection and suppression of duplicate operations, 
as well as the use of the object group mechanisms. 

Passive replication, in which only a designated primary 
replica performs each operation, requires additional mech- 
anisms to ensure consistency of the states of the replicas. 
The Replication Manager performs a state transfer from the 
primary replica to the secondary replicas at the end of each 
operation. Thus, after the primary replica completes each 
operation, the Replication Manager of the primary replica 
multicasts the primary replica’s updated state to the object 
group containing the primary replica. 
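
The passive scheme can be illustrated with a toy replica whose state is a single balance. The names Replica and passive_invoke are hypothetical, and the direct assignment stands in for the multicast of the primary's state over the object group.

```python
class Replica:
    """Toy replicated object whose state is a single balance."""

    def __init__(self):
        self.state = {"balance": 0}

    def apply(self, amount):
        self.state["balance"] += amount


def passive_invoke(primary, secondaries, amount):
    """The primary performs the operation; its state is then copied to the rest."""
    primary.apply(amount)
    for replica in secondaries:
        replica.state = dict(primary.state)  # state transfer after each operation
```

Under active replication, by contrast, every replica would call apply() itself and no state transfer would be needed, at the cost of the duplicate suppression described in Section 6.2.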


7 Conclusion 


The Eternal system is a CORBA 2.0-compliant system that 
enhances CORBA by providing replication, and thus fault 
tolerance, in a manner that is transparent to the application 
and to the ORB. The ORB can employ these replication 
mechanisms without having to undergo any modification to 
its internal structure. 

We are currently implementing the Eternal system using 
various commercial implementations of CORBA, including 
the CORBA-compliant Inter-Language Unification (ILU) 
[4] from the Xerox Palo Alto Research Center. The tech- 
niques used in Eternal are completely generic and can 
interwork with any commercial CORBA implementation 
that is capable of communication over IIOP. 

The replication of objects finds its use not only in achiev- 
ing fault tolerance, but also in allowing system hardware 
and software to be replaced transparently, thereby permitting
the evolution of a system with no interruption of service
to the application. In addition to the Replication Manager, 
the Eternal system provides a Resource Manager and an 
Evolution Manager that handle these challenging issues. 
Abstract. Gold Rush is middleware supporting the
writing of Java applications that reside on an intermit- 
tently connected mobile client device and access an 
enterprise database on a central server. While the client 
is connected to the central server, objects constructed 
from database entities can be cached in a persistent 
store on the client. While the client is disconnected, 
these entities can be manipulated within transactions 
that are logged on the client. Upon reconnection, the 
client application can replay these logged transactions 
to the server, modifying the database. A replayed trans- 
action is checked for conflicts with other database 
updates that have occurred since the client obtained the 
input data for the transaction, and the client is notified 
when such a conflict arises. Communication between 
the client and the server is optimized to economize the 
use of a slow or expensive connection such as a radio 
link. 


Introduction 


Continued rapid advances in mobile computing are 
allowing computing and communication technologies to 
be applied where they are most effective and produc- 
tive. For a traveling salesperson, this is typically the 
customer’s place of business. For a health-care worker, 
it is the point of care. 

Disconnecting the client platform from a high- 
speed network, to provide workers with access to enter- 
prise data when and where they need it, presents many 
difficulties. One such difficulty is the limited capability 
of the mobile worker’s platform (e.g. processor speed, 
disk capacity, battery life). Another is the nature of the 
mobile communications link (e.g. low bandwidth, high 
cost, frequent disconnections). 

Application connectivity requirements for data 
access vary widely. Some applications require high 
degrees of connectivity even while mobile. For
example, a stock trader requires essentially instantane- 
ous access to the constantly changing prices of stocks 
while the market is open, and the ability to execute a 
trade quickly. This requires the trader to pay the high 
cost of wireless communication, or to be tethered to a 
phone line while out of the office. 

In contrast, a financial planner can perform most 
tasks while disconnected from the network. The finan- 
cial planner might begin the day by connecting to a 
central server and downloading data, such as current 
market conditions and the portfolios of customers to be 
visited, from the enterprise financial database to the 
client device. While visiting with a customer, disconnected
from the network, the financial planner might
run a Java application on the device to explore various 
“what-if” scenarios. If it becomes necessary to obtain 
additional data that was not previously downloaded, the 
planner could connect to the central server with a brief 
phone call, download the missing data to the client 
machine, disconnect again, and resume the use of the 
program. If the customer decides to change the portfolio,
the financial planner could execute a local transaction 
ordering the change. The planner could connect to the 
central server immediately to replay the local transac- 
tion on the central server, or connect at the end of the 
day to replay all the day's local transactions at once. 

In this paper, we describe Gold Rush, middleware 
that provides lightweight, platform-independent mobile 
clients with object-oriented, transaction-based access to 
enterprise information over a weakly connected or 
primarily disconnected link. Gold Rush includes a 
client-side persistent store, an object-replication layer to 
track and minimize data traffic, and an intelligent trans- 
action replay engine. The middleware helps facilitate 
the development of mobile applications. These are 
applications that, like the financial planning application, 


Conference on Object-Oriented Technologies and Systems - June 16-20, 1997 




enable off-line, occasionally-connected workers to 
execute transactions on enterprise data. Gold Rush 
enables the financial planner to replicate part of his 
financial and customer databases, to execute off-line 
transactions and log them on the disconnected client 
device, and finally to transmit the logged transactions 
back to the enterprise’s financial and customer 
databases, checking for conflicting updates. 

Gold Rush is not appropriate for every mobile 
application. Gold Rush is most useful for applications 
that execute in an environment where client systems are 
disconnected most of the time. For example, Gold Rush 
was not designed to address the needs of the stock 
trader who needs continual real-time access to database 
information. While the device used by the stock trader 
might in fact be mobile (more precisely, untethered), 
from the point of view of network connectivity the 
device is always connected. There are interesting 
problems that must be solved to keep a wireless 
communication protocol operational in this environ- 
ment, but those problems are beyond the scope of the 
Gold Rush project. In general, applications where the 
central database changes rapidly, and where the latest 
version is always needed for a transaction, are not 
implementable in an occasionally connected
environment. 

The suitability of Gold Rush for a given application 
also depends on the application’s frequency of conflict. 
If the application naturally exhibits a low degree of 
conflict, then it is well suited for the Gold Rush 
environment, which will allow for conflict resolution. 
However, if the application typically generates a large 
amount of conflict, then the application ought to be 
redesigned, or executed in a connected environment. 
For example, a set of transactions that always updates a 
shared counter cannot run in a disconnected environ- 
ment without conflicting at every transaction replay. A 
solution in this case is to avoid updating the shared 
counter at the disconnected client, and to do so later 
when the transaction is being replayed at the server. 
That is, the client transaction can be redesigned to 
specify the amount by which the count should be 
increased or decreased rather than the value by which 
the count should be replaced. Given this redesign, the 
application cannot make use of the actual value of the 
shared counter. However, if the application cannot be 
redesigned to avoid using the current value of the 
shared counter, then the application should connect to 
commit each transaction. 
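
The shared-counter redesign can be made concrete with a short Python sketch. Both functions are illustrative assumptions rather than Gold Rush code: replay_absolute models a logged transaction that records the value it read and the value it wants to store, while replay_delta models the redesigned transaction that records only the change.

```python
def replay_absolute(server_value, logged):
    """logged = (value read at check-out, new value); conflict if the base moved."""
    base, new_value = logged
    if server_value != base:
        raise RuntimeError("conflict: counter changed since check-out")
    return new_value


def replay_delta(server_value, delta):
    """A logged increment applies to whatever the server holds at replay time."""
    return server_value + delta
```

Two clients that each logged an increment while disconnected can both replay with replay_delta, whereas the second replay_absolute would always fail because the first replay changed the base value.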


Off-Line Transaction Requirements 


Mobile client applications require access to enter- 
prise data. It is not practical to rely on a constant 
wireless connection to a server for this access, because 
radio devices drain batteries quickly and because
any-time, any-place wireless links are expensive. 
Therefore, we replicate enterprise data from the 
database server on mobile client devices. 

The problem is how to write business applications 
to run seamlessly in both connected and disconnected 
off-line modes. Will different applications be needed 
for off-line mode, or can the same application be used 
with some features disabled? What requirements are 
imposed by a mobile application, beyond those of a 
classical client/server connected-mode application? 
Which data should be replicated, how should it be repli- 
cated, and how should the replicated data on the client 
be synchronized with the data on the main server? How 
might conflicts arise, and how should they be resolved? 
To answer these questions, we must examine the 
characteristics of both business data and off-line 
business applications. 

Off-line business applications are structured into 
transactions to guarantee atomicity, concurrency 
control, and durability of the data. A transaction repre- 
sents a set of read/write operations upon business data. 
Transactional access enforces integrity constraints: 
Noncompliant transactions are aborted and accepted 
transactions are committed into the accumulated state of 
the database. An off-line transaction commits data 
into a client’s local store while the client is discon- 
nected. The commit is replayed to the master database 
when the client is reconnected. This approach of a 
lazy commit, consisting of two stages, is necessitated by 
the occasionally connected nature of mobile applica- 
tions. In an analytical paper [GH96], Gray et al. 
compare several transaction propagation strategies and 
conclude that the lazy-commit approach is the one that 
scales well and fits into mobile environments. 

Conflicts may arise during reintegration with the 
main database. For example, if multiple clients change 
the same field while they are disconnected, the conflict 
can only be detected during reintegration. The previ- 
ously committed data of the disconnected clients might 
then be reconciled using a predetermined formula based 
on the context of the transaction or the data itself. 

The flow of the off-line transaction model of appli- 
cation development can be summarized as follows: 

1. Check out: partial replication of business data 
from the business server together with its integ- 
rity constraints 

2. Access: off-line transactional access with all the 
read/write information logged 

3. Check in: reintegration of off-line transactional 
data with main database 

4. Conflict handling: detection and resolution of 
conflicts with predefined formulas or repair 
utilities

Transactions provide off-line, occasionally
connected business applications with a design model for 
data integrity and conflict prevention. Without such a 
model, we would not be able to determine the scope of 
conflict and to make proper repairs to reintegrate 
off-line data with the master database. Our model of 
off-line transactions is similar to the transaction model 
of network-partitioned databases, but less stringent on 
the client side, because the mobile client is a single-
user, one-application-at-a-time system that plays a
second-class role relative to the master database.


Related Approaches 


Existing Java remote-database-access products 
based on the JDBC API [Ja97] are designed for perma- 
nently connected clients. These include IBM’s 
VisualAge for Java [IB97] and Symantec’s dbANY- 
WHERE [Sy97]. In contrast, Gold Rush supports an 
occasionally connected client. JDBC provides access to 
data in terms of relational-database tuples. In contrast, 
Gold Rush supports manipulation of Java objects. 

Several methods have been proposed to allow 
mobile workers to access information from central data 
bases. The methods are as varied as the type of infor- 
mation: files, relational data bases, web pages, etc. For 
mobile file access, the Coda remote file system of CMU 
pioneered the notion of disconnected operations [KI92], 
based on file-level transactions isolating groups of 
changes to files by an application [LS94]. For mobile 
access to documents, Lotus Notes [Lo96] handles
two-way data replication, allowing document-level and 
field-level propagation, but without grouping changes 
into transactions. Several approaches have been 
suggested for access to database information. Of great- 
est interest to us are those methods which not only make 
the information available for reading, but also allow 
changes to be written. 

An alternative approach is to download a portion of 
the central database to a private database on the mobile 
client. The smaller database is accessible through tradi- 
tional interfaces, such as ODBC or JDBC, residing on 
the client. The application makes all modifications to 
the smaller database. When the mobile client is able to 
communicate again with the central database, the 
changes made to both databases are reconciled. The 
reconciliation is carried out by software usually called a 
replicator. 

If the replicator detects a conflict during reconcilia- 
tion, it acts according to its configuration. Typically, 
replicators can be instructed to carry out some default 
action in case of conflict (for example, merging the
changes if possible, or discarding the tuple with the
older timestamp) and also allow for custom-programmed
actions.


Replicators are often tightly coupled with the 
implementations of the database systems both in the 
mobile client and the centralized system. In addition to 
a complete replicator, that is, one that can incorporate 
changes from both sides, this approach requires the 
availability of a suitable small server that can host the 
smaller database on the mobile client. 

This design, and the suitability of the replication 
and reconciliation mechanism in mobile or other 
environments, have been studied in depth. [GH96] 
points out the instability of some replication methods
and proposes algorithms that alleviate this problem;
[Fr96] addresses scalability and availability issues;
[RZ96] discusses the effect of disconnected operation on
transactions and proposes a transaction management
model; [YT96] proposes optimistic concurrency
control, and addresses migration and replication 
methods; [Pi96] presents a method for replication in the 
presence of challenging connections; [ZF96] proposes 
another replication method and analyzes its perform- 
ance; [Wo95] evaluates yet another strategy using an 
application for travel agents; [YW94] describes an 
algorithm for dynamic allocation of replicas; [AN93] 
discusses replication organization and reconciliation
methods. Finally, major database vendors offer 
database-access products for mobile workers based on 
this approach ([Sy96], [Or95], [IB95]). 

A replicated database burdens the mobile client 
with a database server and with logic to access the data 
stored on this server. Recently, the increasing popular- 
ity of Internet and intranet applications has made light- 
weight clients desirable. Rather than placing a database 
server on the mobile client, a three-tier architecture with 
mobile transaction middleware gives the client access 
to server data without tying the client to a specific 
database implementation. Three-tier systems move the 
interface between the application and the database to a 
central server. The Tactica Corporation has a commer- 
cially available product, Caprera, which supports 
off-line long-lived transactions and three-tiered access 
to databases [La96]. 


Our Approach 


In a mobile database application, mobile transac- 
tion middleware provides mobile connectivity and 
mobile data management. The mobile middleware 
provides support for: 

• a wire-efficient access protocol

• object caching and replication

• logging of deferred transactions

• a server-side object server to reduce the
frequency and duration of slow-link connections

It is not sufficient simply to extend database query
capability to the mobile client. There must be services 
to manage the data for mobile use. 

Gold Rush mobile data management is based on 
Java objects. Java has attracted wide interest because it 
facilitates cross-platform deployment. Furthermore, the 
Java Remote Method Invocation (RMI) API [Su96] 
supports remote method-call and object-shipping
paradigms, which are useful for both connected and 
disconnected operations. Java technology is very well 
suited for mobile database-access applications. 

An objective of Gold Rush is to make enterprise
data available to Java applications. Enterprise data is 
most likely to be found in relational databases, VSAM 
data sets, or IMS databases; a very small portion of 
such data is in object databases. Typically, a one-time 
conversion of these relational data bases into object 
data bases is not possible because of other existing 
applications that regenerate and alter the data stored in 
them. Our current prototype supports mappings between 
relational data base tuples and Java objects. 

In a connected environment, one can use a remote 
method call to access an enterprise database or to 
download an object for temporary caching. In an 
occasionally connected environment, one must first 
download Java classes and data to the Java-enabled 
client. Java applications or trusted applets can then 
support disconnected operations through locally persis- 
tent objects. Application code can manipulate local and 
remote objects uniformly, through the same object 
interface. 

The Gold Rush three-tier architecture consists of a 
Java client, an intermediate mobile object server, and a 
back-end data store (see Figure 1). The mobile
middleware resides partly on the client and partly on the
intermediate server. The middleware presents the client 
application with the same transaction API regardless of 
connection mode, except that the database cannot be 
queried in disconnected mode. Thus, a method call that 
would obtain a service directly from the server in 
connected mode invokes middleware that transparently 
performs that service locally in disconnected mode, 
using locally available resources. 

This is not to say that the application programmer 
is oblivious to the mobile nature of the application. The 
parts of the application that are specifically mobile and 
must be exposed to the programming interface include 
the handling of the modes of connectivity, prefetching 
and downloading objects, controlling the replay of 
transactions, and resolving conflicts during
reconnection. 

The Gold Rush middleware has the following basic 
components (see Figure 2): 

• Database objects: A database object is a Java
object that represents an instance of a database
entity.

• Object caching: To support disconnected trans-
actions on data base objects, we provide a persis-
tent local object store on the client.

• Transactions with optimistic concurrency
control: The client works primarily off-line on
data objects in the mobile device. Transactions
are logged on the client side and replayed to the
server when connection is established. An object
read by a transaction may be either locked with
an optimistic read lock or left unlocked. An
object written by a transaction is locked with an
optimistic write lock.

• Communication: Database objects are trans-
ferred between client and server using RMI and
Sun's object serialization mechanism. To
optimize this communication, we keep track of
objects known to be present both at the client and
at the server and transmit only the differences
between the object version to be transmitted and
the version, if any, known to be stored remotely.

Figure 1. Our three-tiered architecture

Figure 2. Mobile middleware components

The following sections describe the correspondence 
between database objects and relational database 
entities, the persistent client object store, the off-line 
transaction model, and optimization of communication. 


Correspondence Between 
Relational Data Bases and Java Objects 


A database object is a Java object that corresponds 
to a row in a relational database table. Each such object
belongs to a subclass of a Java class named Entity. 
Each such subclass corresponds to a table of the 
relational database. 

In a relational database, tuples in different tables 
are related through primary and foreign keys. We 
distinguish among 1-1, 1-n, and m-n relationships. (1-1 
relationships are special cases of 1-n relationships.) If a
1-n relationship exists between two object classes, then 
the table corresponding to one object class must have a 
foreign key into the table corresponding to the other 
class. If there is an m-n relationship between two object 
classes, then there must be a third table with foreign 
keys into both of the tables corresponding to the related 
classes. 

Our system provides methods to retrieve database 
objects that satisfy queries, for example, a query on 
foreign keys, when the client is connected to the server. 
To allow retrieval of these collections when the client is
disconnected, we provide methods to associate names
with collections. These named collections are persistent 
at the client. 

An application using our system would include a 
layer to insulate the details of relational database 
storage, such as foreign keys, from the manipulation of 
the objects themselves. Such a layer would provide a 
subclass of Entity for each kind of database object, 
defining the object’s properties and its methods. This 
class would also supply methods for navigation among 
objects, using the Gold Rush query facility, and associ- 
ate unique names with collections to allow the associa- 
tion between objects to persist. Finally, the application 
would supply a data manager class for each subclass of 
Entity, establishing the correspondence between 
object properties and database fields and implementing 
the database retrieval and store function. 

We have written a tool that allows the application 
developer to map relational data to object classes. This 
tool generates the code that defines the classes of 
database objects; the code needed to instantiate the 
objects from tuples in the relational database and write 
object instances into the relational database; and the 
code that navigates between related object instances. 

The mapping tool is a Lotus Notes application. We
chose Lotus Notes because it offers flexible storage to 
represent the association between objects’ attributes and 
fields in tuples, a fast way to develop the user interface 
through which the programmer establishes this associa- 
tion, and a sufficiently powerful programming language 
to support the code generation. 

It is possible to map a subset of one tuple to a 
subset of an object. In practice, however, an entire 
tuple is typically mapped to an entire object. The 
mapping tool does not support the mapping of an object
to multiple tuples, whether from the same table or from
multiple tables. 

The tool allows the relationships between tuples to 
be reflected in relationships between object instances, 
through generated code which allows the application to 
retrieve related object instances, or establish relations 
among object instances. Code is generated for setXXX
and getXXX methods (where XXX represents an attribute 
of a database object) in both classes corresponding to a 
relationship. A getXXX method retrieves the associated 
object instances (one instance in the case of 1-1 
relationships); a setXXX method establishes such 
relations. 
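
As an illustration, the generated accessors for a 1-n relationship might resemble the following sketch. Only the class name Entity comes from the text; its members, the Department and Employee classes, and the in-memory list that stands in for a foreign-key query are assumptions made here. For brevity the sketch shows the setter on one side of the relationship only.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: Entity is named in the text, but its members and the
// Department/Employee classes below are assumptions, not generated Gold Rush code.

abstract class Entity {
    private final String objectId;

    Entity(String objectId) { this.objectId = objectId; }

    String getObjectId() { return objectId; }
}

/** The "1" side of a 1-n relationship. */
class Department extends Entity {
    private final List<Employee> employees = new ArrayList<>();

    Department(String objectId) { super(objectId); }

    /** getXXX on the "1" side returns all associated instances. */
    List<Employee> getEmployees() { return new ArrayList<>(employees); }

    // Helpers used by Employee.setDepartment to keep both sides in step.
    void link(Employee e) { employees.add(e); }

    void unlink(Employee e) { employees.remove(e); }
}

/** The "n" side; its table would hold the foreign key into the Department table. */
class Employee extends Entity {
    private Department department; // stands in for the foreign-key column

    Employee(String objectId) { super(objectId); }

    /** getXXX on the "n" side returns the single associated instance, or null. */
    Department getDepartment() { return department; }

    /** setXXX establishes the relation and updates the other side. */
    void setDepartment(Department newDepartment) {
        if (department != null) {
            department.unlink(this);
        }
        department = newDepartment;
        if (newDepartment != null) {
            newDepartment.link(this);
        }
    }
}

class RelationshipDemo {
    public static void main(String[] args) {
        Department research = new Department("dept-1");
        Employee alice = new Employee("emp-1");
        Employee bob = new Employee("emp-2");

        alice.setDepartment(research);
        bob.setDepartment(research);

        System.out.println(research.getEmployees().size());
        System.out.println(alice.getDepartment().getObjectId());
    }
}
```

In code produced by the mapping tool, navigation would presumably go through the query facility or a named collection rather than an in-memory list.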

For each subclass of Entity, the tool generates a 
corresponding JDBC-based data-manager class that 
provides SQL statements for retrieval, insertion, and 
update. Each data-manager class provides a method 
that retrieves data from a particular database table based 
on an SQL query and returns a Java collection object 
containing one Java database object for each row in the 
relational database that satisfied the query. There is 
also a method to retrieve an object given its unique 
object ID. The data manager’s insertion method takes
a Java database object and inserts a new row into the 
database, checking that uniqueness constraints are not 
violated. The data manager’s update method takes a 
Java object and replaces one row in the relational 
database with the data found in the object, checking for 
update conflicts. 


Persistent Client Store 


We use Sun’s object serialization as the principal
means of generating a persistent form of an object. To
further control the persistent state, and to improve 
efficiency, we have implemented the serialization 
methods readObject and writeObject selectively on 
certain complex internal objects. 
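
The standard Java serialization hooks mentioned above have the signatures shown in the following sketch. The ObjectIndex class and its compact on-disk form are hypothetical; the text does not say which internal objects were customized or how.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical internal object; the paper does not say which classes were customized.
class ObjectIndex implements Serializable {
    private static final long serialVersionUID = 1L;

    // transient: excluded from default serialization and written by hand below.
    private transient Map<String, Long> offsets = new HashMap<>();

    void put(String objectId, long fileOffset) { offsets.put(objectId, fileOffset); }

    Long get(String objectId) { return offsets.get(objectId); }

    int size() { return offsets.size(); }

    /** Writes a compact form: an entry count followed by (ID, offset) pairs. */
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject(); // non-transient fields (none here)
        out.writeInt(offsets.size());
        for (Map.Entry<String, Long> e : offsets.entrySet()) {
            out.writeUTF(e.getKey());
            out.writeLong(e.getValue());
        }
    }

    /** Rebuilds the map from the compact form written by writeObject. */
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        int n = in.readInt();
        offsets = new HashMap<>();
        for (int i = 0; i < n; i++) {
            String objectId = in.readUTF();
            long offset = in.readLong();
            offsets.put(objectId, offset);
        }
    }
}

class SerializationDemo {
    /** Serializes to a byte array and back, as a stand-in for writing to the store. */
    static ObjectIndex roundTrip(ObjectIndex index) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(index);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (ObjectIndex) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ObjectIndex index = new ObjectIndex();
        index.put("emp-1", 0L);
        index.put("emp-2", 128L);
        ObjectIndex copy = roundTrip(index);
        System.out.println(copy.size() + " " + copy.get("emp-2"));
    }
}
```

Writing the entries by hand gives the class control over exactly which state persists, which is the kind of control the selective implementation is meant to provide.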

To cache persistent data on the mobile client 
device, we create a small single-user persistent store. 
This object store provides object lookup, store, update, 
and retrieval functions for the objects used by applica- 
tions in off-line transactions. The main operations of 
the store are: 

• Caching: Data is cached into the client’s persis-
tent store during connected transaction process-
ing. The latest version of an object is saved in
the store in preparation for subsequent discon-
nected operations. (The communication optimi-
zations to be discussed later prevent the
redundant transmission of an object already resid-
ing in the client’s store.)

• Retrieval: The application code residing on the
client reads objects from the persistent store
while the client is disconnected. An application
may retrieve an object by calling a method that
retrieves the latest version of an object with a
given ID, or by calling a method that returns a
collection containing all the locally stored objects
of a particular class.

• Committal: All mutated objects in the committed
transaction are stored and updated. A transac-
tion record is created and the information is
stored in the transaction log for future replay.

• Replay: The client retrieves the transaction log
and replays it to the master database, guided by
the dependency relationships among the transac-
tions. After replay, the persistent store cleans
itself up by removing stale and old versions of
objects as well as old transaction information.

Since the client store is a single-user,
one-application-at-a-time persistent store, we choose a
file-based design with careful write sequencing to 
guarantee that the on-disk data is always in a consistent 
state. The store consists of the following kinds of files: 

• Class files: Objects of the same class are stored
in a single file. This file contains all versions of
all objects of a given class resident on the client.
A new version of an object is written when a
transaction modifying that object is committed.

• Class index files: A separate index file is
created for each class. The index provides fast
lookup of objects by object ID and transaction ID.

• Transaction files: A file is created for each
transaction when that transaction is committed.
This file identifies the exact version of each
object involved in the transaction and also identi-
fies the other transactions upon which this trans-
action depends.

• Transaction log: There is one file containing a
record of each transaction in the system. The log
entry for a given transaction includes the name
and state of the transaction as well as information
concerning transactions on which the given trans-
action depends. The major states of a transaction
are locally committed, remotely committed, and
aborted.


Off-Line Transactional Semantics 
and Disconnected Transactions 


When transactions are run while the client is 
connected, locks are held in the database and the trans- 
action runs in the traditional way. We will not describe 
the connected mode of operation in this paper. While 
the client is disconnected from the server, locks are not 
held in the database and the system runs in a “lazy” 
mode similar to that described in [GH96]. Multiple 
transactions can be run against objects resident in the
client’s local store. When the client reconnects, the
transactions are replayed on the server. 

We perform disconnected transactions using the 
latest version of each object resident in the client’s local 
store, and save the results in the transaction log. In 
addition, the changed version of each object modified 
by the transaction is saved. When the client reconnects, 
the transaction log is replayed to the server and the final 
commit to the database is attempted. The initial execu-
tion is called a local commit and the replay is called a
remote commit. Locks are granted to the client optimis- 
tically before disconnection. These locks are used to 
detect conflicts during remote commit. 

To support conflict detection, each object has an 
object ID and a timestamp, which are stored in the 
database. The object ID is generated when the object is 
created. It is unique, is not modifiable, and is a key of 
the database table. The object timestamp is unique and 
is generated locally on the client at the time of commit. 
It consists of a unique user number concatenated with a 
local clock reading and a counter to distinguish among 
objects created during a single tick of the clock. 
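
A timestamp with these three components could be modeled as in the sketch below. The field types and all names (ObjectTimestamp, TimestampGenerator) are assumptions, as is the choice to keep incrementing the counter if the local clock steps backward; the text specifies only the components themselves.

```java
import java.util.Objects;

// Sketch only: the field types and every name below are assumptions; the text
// specifies just the three components of a timestamp.

/** Immutable timestamp; compared for equality only, never for order. */
final class ObjectTimestamp {
    final int userNumber;   // unique to one client
    final long clockMillis; // local clock reading
    final int counter;      // separates stamps issued within one clock tick

    ObjectTimestamp(int userNumber, long clockMillis, int counter) {
        this.userNumber = userNumber;
        this.clockMillis = clockMillis;
        this.counter = counter;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof ObjectTimestamp)) {
            return false;
        }
        ObjectTimestamp t = (ObjectTimestamp) other;
        return userNumber == t.userNumber
            && clockMillis == t.clockMillis
            && counter == t.counter;
    }

    @Override
    public int hashCode() { return Objects.hash(userNumber, clockMillis, counter); }

    @Override
    public String toString() { return userNumber + "-" + clockMillis + "-" + counter; }
}

/** Issues timestamps for one client. */
class TimestampGenerator {
    private final int userNumber;
    private long lastTick = Long.MIN_VALUE;
    private int counter;

    TimestampGenerator(int userNumber) { this.userNumber = userNumber; }

    /** Issues a stamp for an explicit clock reading, which keeps the sketch deterministic. */
    ObjectTimestamp next(long clockMillis) {
        if (clockMillis > lastTick) {
            lastTick = clockMillis; // new tick: restart the counter
            counter = 0;
        } else {
            counter++;              // same tick, or a clock that stepped backward
        }
        return new ObjectTimestamp(userNumber, lastTick, counter);
    }

    ObjectTimestamp next() { return next(System.currentTimeMillis()); }
}

class TimestampDemo {
    public static void main(String[] args) {
        TimestampGenerator client7 = new TimestampGenerator(7);
        TimestampGenerator client8 = new TimestampGenerator(8);
        System.out.println(client7.next(1000L)); // 7-1000-0
        System.out.println(client7.next(1000L)); // 7-1000-1
        System.out.println(client8.next(1000L)); // 8-1000-0
    }
}
```

Two stamps issued by the same client in the same tick differ in the counter, and stamps from different clients differ in the user number, so no coordination between clients is needed.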

Gold Rush provides read locks and write locks with 
the usual semantics (shared read and exclusive write) 
[Da90]. These locks are not checked during discon- 
nected execution, since transactions proceed in strict 
serial order on the client, but they are checked during 
transaction replay (remote commit). It is also possible 
to read an object without locking it, and without check- 
ing for currency when the transaction is replayed. 

Reading without locking reduces the amount of 
lock contention. If a transaction reads an object without 
locking it, the transaction can commit successfully even 
if the version of the object read by the transaction is 
obsolete at commit time. Reading without locking is 
useful when it is known that the attributes actually used 
by the application in a read object (for example, the 
name and serial number in an Employee object) are 
unlikely to change even though other attributes in the 
object (for example, the employee’s accrued vacation 
time) may change. Reading without locking is also 
useful when an approximate answer is sufficient to 
satisfy application requirements. 

A transaction is started on the client when the 
application calls a beginTransaction method. After
that, other methods can be called to register particular
objects with the current transaction. The application can 
use and modify any registered objects as it chooses and 
eventually call a commit method, closing the current 
transaction. When commit is called in disconnected 
mode, each modified object is written to the proper 
class store. A transaction file is created with references 
to each of the locked objects in the transaction. The 
transaction log is extended to include this new transac-
tion file. At this point the transaction’s state is locally
committed. 

When the client eventually reconnects to the server, 
it serially reads all locally committed transactions and 
replays the transactions to the server. Each of these 
transactions creates an equivalent transaction on the 
server. The set of locked read objects and the set of
write objects are checked for conflicts. If no conflicts
are detected, the database is updated and the transaction
is marked remotely committed.

To perform conflict detection the system tracks the 
following information: 

• On the server, each database tuple includes the
object ID and the timestamp of the last update.
This timestamp is called the last-modified time.

• On the client, each modified object includes two
timestamps. One is the last-modified time and the
other is the local-commit time.

When an object is first read by a transaction, the
last-modified timestamp is set to the timestamp of the last
local commit that modified the object, or initially to the
database timestamp. When the transaction commits, the
local-commit timestamp is set and the object is written
to disk. When the transaction is replayed during remote
commit, the object’s last-modified timestamp is
compared to the database tuple’s timestamp. If they are
the same and the object has been modified, the object
will replace the database tuple and the last-modified
timestamp in the database will be changed to the client
object’s local-commit timestamp. If the object has not
been modified, the replay proceeds to the next object in
the transaction. If the last-modified timestamps are
different for any object in the transaction, the transac- 
tion is rejected. (No global clocks are required, because 
timestamps are compared only for equality, not order, 
and each timestamp includes a user number unique to its 
client.) 
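
The replay check can be sketched as follows. This is an illustration under stated assumptions, not the Gold Rush implementation: all class and method names are hypothetical, timestamps are plain strings because only equality matters, a Map stands in for the database table, and the check is split into two passes so that a rejected transaction leaves the table untouched (a real server would more likely rely on a database transaction and rollback). Creation of new objects is not modeled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the replay check; names and structure are assumptions.

/** Server-side tuple: current data plus the last-modified timestamp. */
class ServerRow {
    String lastModified;
    String data;

    ServerRow(String lastModified, String data) {
        this.lastModified = lastModified;
        this.data = data;
    }
}

/** One object recorded in a locally committed transaction. */
class LoggedObject {
    final String objectId;
    final String lastModified; // version the transaction was based on
    final String localCommit;  // timestamp assigned at local commit
    final boolean locked;      // false for a read made without locking
    final boolean modified;
    final String data;

    LoggedObject(String objectId, String lastModified, String localCommit,
                 boolean locked, boolean modified, String data) {
        this.objectId = objectId;
        this.lastModified = lastModified;
        this.localCommit = localCommit;
        this.locked = locked;
        this.modified = modified;
        this.data = data;
    }
}

class ReplayDemo {
    /**
     * Attempts the remote commit. Returns true if the transaction is remotely
     * committed and false if it is rejected; a rejected transaction changes nothing.
     */
    static boolean replay(Map<String, ServerRow> db, List<LoggedObject> tx) {
        // Pass 1: each locked or written object must match the server's version.
        for (LoggedObject o : tx) {
            if (!o.locked && !o.modified) {
                continue; // unlocked reads are not checked for currency
            }
            ServerRow row = db.get(o.objectId);
            if (row == null || !row.lastModified.equals(o.lastModified)) {
                return false;
            }
        }
        // Pass 2: install the modified objects and advance their timestamps.
        for (LoggedObject o : tx) {
            if (o.modified) {
                ServerRow row = db.get(o.objectId);
                row.data = o.data;
                row.lastModified = o.localCommit;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, ServerRow> db = new HashMap<>();
        db.put("emp-1", new ServerRow("s0", "vacation=10"));

        // Clients A and B both checked out emp-1 at version s0 and changed it off-line.
        List<LoggedObject> txA = new ArrayList<>();
        txA.add(new LoggedObject("emp-1", "s0", "userA-1000-0", true, true, "vacation=8"));
        List<LoggedObject> txB = new ArrayList<>();
        txB.add(new LoggedObject("emp-1", "s0", "userB-1005-0", true, true, "vacation=5"));

        System.out.println(replay(db, txA)); // A reconnects first
        System.out.println(replay(db, txB)); // B's copy is now stale
        System.out.println(db.get("emp-1").data);
    }
}
```

In this run the first replay succeeds and installs A's local-commit timestamp as the new last-modified time, so B's transaction, still based on s0, is rejected and the row keeps A's data.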


Reducing Data Traffic Between 
Client and Server 


When a mobile client connects to the server, the 
connection may be over a slow and expensive link such 
as a cellular phone connection. Therefore, it is impor- 
tant to minimize the amount of data exchanged between 
the client and the server, even at the cost of additional 
computation and additional storage requirements. 

We reduce traffic between the client and server by 
maintaining mirrored directories of objects known to be 
stored on both the client and the server. There is one 
such directory on each client machine and one mirrored 
directory per client on the server machine. Each direc-
tory entry contains an object ID and a reference to a
local copy of the corresponding object. 

Before an object is transmitted remotely, we check 
whether its object ID is in the local copy of the direc- 
tory. If not, we transmit the entire object and—since 
the object is now stored on both the client and the 
server—add a corresponding directory entry to each 
copy of the directory. If the object ID is already in the 
local directory, we compare the timestamp of the object 
referenced in the directory with the timestamp of the 
object to be transmitted. If the timestamps are the 
same, we transmit only the object ID. If the timestamps 
are different, we transmit both the object ID and a 
succinct representation of the differences between the 
version of the object to be transmitted and the version 
referenced by the directory entry. 

We rely on RMI for the actual transport of data 
between client and server. Entities are transmitted from 
the server to the client only as the function result of an 
RMI call by the client asking for an object with a 
particular object ID, or as elements of the function 
result of an RMI call by the client asking for a vector of 
objects satisfying a particular SQL query. Entities are 
transmitted from the client to the server only as 
elements of a parameter of an RMI call asking the 
server to commit a particular transaction. 

We do not tamper with the internal mechanisms of 
RMI to take advantage of our mirrored directories. 
Rather, we use an abstract class RemoteEntity in place 
of the class Entity in the parameters and function
results of the RMI call. This abstract class has three
subclasses providing concrete implementations: 

• FullRemoteEntity. This class carries all the
information contained in an entity.

• OidOnlyRemoteEntity. An object of this class
contains only the object ID of an entity.

• DeltaRemoteEntity. An object of this class
contains only the object ID of a base entity and a
succinct description of the changes that must be
applied to the base entity to obtain the desired
target entity.
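
The choice among the three forms, together with the reconstruction at the receiving end, can be sketched as follows. The class names follow the text, but every field and method is an assumption, as are the delta format (a map of changed attribute values plus the new timestamp) and the rule that both sides record the newest version in their directory copy after each transfer. RMI and serialization are omitted so that the sketch stays self-contained.

```java
import java.util.HashMap;
import java.util.Map;

// Class names follow the text; every field and method, and the delta format
// (a map of changed attribute values plus the new timestamp), is an assumption.

class Entity {
    final String objectId;
    final String timestamp;
    final Map<String, String> attributes; // attribute values assumed non-null

    Entity(String objectId, String timestamp, Map<String, String> attributes) {
        this.objectId = objectId;
        this.timestamp = timestamp;
        this.attributes = new HashMap<>(attributes);
    }
}

abstract class RemoteEntity {
    final String objectId;

    RemoteEntity(String objectId) { this.objectId = objectId; }

    /** Rebuilds the ordinary entity from the receiver's copy of the mirrored directory. */
    abstract Entity toEntity(Map<String, Entity> directory);
}

class FullRemoteEntity extends RemoteEntity {
    final Entity entity;

    FullRemoteEntity(Entity entity) { super(entity.objectId); this.entity = entity; }

    @Override Entity toEntity(Map<String, Entity> directory) { return entity; }
}

class OidOnlyRemoteEntity extends RemoteEntity {
    OidOnlyRemoteEntity(String objectId) { super(objectId); }

    @Override Entity toEntity(Map<String, Entity> directory) { return directory.get(objectId); }
}

class DeltaRemoteEntity extends RemoteEntity {
    final String newTimestamp;
    final Map<String, String> changed;

    DeltaRemoteEntity(String objectId, String newTimestamp, Map<String, String> changed) {
        super(objectId);
        this.newTimestamp = newTimestamp;
        this.changed = changed;
    }

    @Override Entity toEntity(Map<String, Entity> directory) {
        Map<String, String> merged = new HashMap<>(directory.get(objectId).attributes);
        merged.putAll(changed);
        return new Entity(objectId, newTimestamp, merged);
    }
}

class TrafficReducer {
    /** Sender side: chooses the smallest adequate form and updates the sender's directory. */
    static RemoteEntity encode(Entity e, Map<String, Entity> directory) {
        Entity known = directory.put(e.objectId, e); // previous entry, or null
        if (known == null) {
            return new FullRemoteEntity(e);
        }
        if (known.timestamp.equals(e.timestamp)) {
            return new OidOnlyRemoteEntity(e.objectId);
        }
        Map<String, String> changed = new HashMap<>();
        for (Map.Entry<String, String> a : e.attributes.entrySet()) {
            if (!a.getValue().equals(known.attributes.get(a.getKey()))) {
                changed.put(a.getKey(), a.getValue());
            }
        }
        return new DeltaRemoteEntity(e.objectId, e.timestamp, changed);
    }

    /** Receiver side: restores the entity and keeps the receiver's directory mirrored. */
    static Entity decode(RemoteEntity r, Map<String, Entity> directory) {
        Entity e = r.toEntity(directory);
        directory.put(e.objectId, e);
        return e;
    }

    public static void main(String[] args) {
        Map<String, Entity> clientDir = new HashMap<>();
        Map<String, Entity> serverDir = new HashMap<>();
        Map<String, String> attrs = new HashMap<>();
        attrs.put("name", "Ann");
        attrs.put("vacation", "10");
        Entity v1 = new Entity("emp-1", "t1", attrs);
        attrs.put("vacation", "8");
        Entity v2 = new Entity("emp-1", "t2", attrs);

        // First transfer is full, the repeat is ID-only, the new version is a delta.
        for (Entity e : new Entity[] { v1, v1, v2 }) {
            RemoteEntity wire = encode(e, clientDir);
            Entity received = decode(wire, serverDir);
            System.out.println(wire.getClass().getSimpleName() + " " + received.attributes);
        }
    }
}
```

The sketch depends on the two directory copies staying in step: the receiver can resolve an ID-only or delta form only if its directory holds the same base version that the sender consulted.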

The remote method for committing a transaction is 
called through an interface that accepts vectors of 
entities participating in the transaction, constructs 
vectors in which each entity is replaced by a remote entity
of the appropriate form (based on whether that entity is 
present in the mirrored directories), and passes these 
vectors to the remote method. The server converts 
these vectors back to their original form using its copy 
of the mirrored directory and performs the commit 
operation. Similarly, when the client calls a remote 
method to obtain a particular entity or vector of entities, 
it does so through an interface that will convert the 
remote entities returned by the RMI call into ordinary 
entities, using the client copy of the mirrored directory. 
The remote method itself, executed at the server, first 
performs the necessary database operations to construct 
the result entity or result vector, then constructs a corre- 
sponding remote entity or vector of remote entities 
based on its copy of the mirrored directory and returns
that object as the result of the RMI call. (See Figure 3.)

Figure 3. Protocol for optimized RMI calls.


Conference on Object-Oriented Technologies and Systems - June 16-20, 1997 


USENIX Association 




Because we control the amount of data passed in 
remote calls rather than the mechanisms by which 
remote calls transmit their data, the fact that we are 
using RMI is incidental to our approach. The same 
approach could be used if we were to replace RMI calls 
with IIOP/CORBA calls. Indeed, by concentrating the 
bulk of our remote calls in the part of Gold Rush 
responsible for traffic reduction, we have encapsulated 
our decision to use RMI. Very little of our code would 
have to change if we were to decide at some point to use 
IIOP instead.


The Role of the Application Code 


A system that uses Gold Rush contains application 
code in both the client and the server. In the server, the 
application code implements the interface between Gold 
Rush and the persistent repository, typically a relational 
database. For this interface the application supplies 
definitions of the objects and methods for: 

• querying the attributes of a single object;

• setting the attributes of a single object; and

• retrieving a set of objects, all of the same
class, satisfying a query clause.

In addition, the application must supply methods to
resolve conflicts. We also envision that the application 
code at the server could run more complex requests, 
perhaps activating an agent to execute autonomously 
on behalf of the client, and perhaps yielding a result 
consisting of objects of various classes. 

At the client, Gold Rush provides methods for
transaction start, commit and rollback, object creation, 
and lock upgrade. There are also methods for a 
connected client to create collections of objects of a 
given class satisfying a given query. Finally, there are 
primitives for associating names with such collections 
and making the collections persistent at the client. The 
application must supply methods to navigate between 
related objects. 


Future Directions 


We have implemented a prototype version of Gold 
Rush. Gold Rush is now being considered for integra- 
tion into a large business-object framework. Its inter- 
faces were designed to facilitate this integration. Issues 
that we did not address in the prototype but are impor- 
tant in a production system, such as data security, would 
be addressed during this integration. 

We are aware of a number of ways in which we can 
improve the performance and flexibility of Gold Rush. 
We expect to incorporate these improvements in our 
future work. 

Currently, the client is notified when a conflict is 
detected during an attempt to perform a remote commit.
However, any actions to recover from the conflict, for
example by merging updates or retrying a transaction 
with fresh data, must be programmed explicitly. Future 
versions of Gold Rush will include a framework for 
specifying conflict-resolution strategies. It will be 
possible to specify strategies both on a class-by-class 
basis and on a scenario-by-scenario basis. 

Our current mapping between relational data bases 
and objects entails the addition of columns to the 
relational database tables to hold object ID and time- 
stamp data for use in conflict detection. This is incon- 
venient and sometimes unacceptable when dealing with 
legacy data bases. We are formulating techniques that 
will allow access to legacy data bases without changing 
the format of those data bases. 

The current system requires that all data be 
preloaded before disconnection. This requires careful 
planning by the user to avoid being stranded without the 
data needed to complete one’s work. We are investigat- 
ing techniques to dynamically connect to the server and 
fetch missing data. 

Our prototype presumes that the client middleware 
is invoked by only one application at a time, and that all 
of the application’s invocations of middleware methods 
are performed by a single thread. Thus there is no 
concurrency control in the local-commit logic. This is 
not an inherent limitation in our approach, merely a 
simplifying assumption made for the prototype version. 
We plan to rewrite the client middleware to make it 
thread-safe, so that the mobile worker can run several 
applications at a time and so that a client application 
programmer can take advantage of Java threads. 

Our current mobile object server utilizes one active 
remote-object-server object for every client regardless 
of whether the client is connected or disconnected. This 
naive approach supports high concurrency but requires
a large number of thread resources, potentially over a
long period of time, when there are many clients. We
plan to investigate simple activation approaches such as 
the one described by Wollrath [WW95] to reuse 
remote-object-server objects if possible and to activate 
and deactivate persistent remote objects with low 
overhead. 

Several of our strategies for reducing traffic over a 
slow or expensive link entail a large amount of compu- 
tation. Over a sufficiently slow link, the time saved by 
reducing traffic more than makes up for the time 
expended to perform the computation. However, on 
occasions when the client is connected to the server 
over a fast and inexpensive link (as when a mobile user 
returns to the office and connects the client machine
directly to a LAN), the time saved by reducing traffic is 
negligible, and the computational cost is no longer 
worthwhile. Therefore, we plan to provide controls for 
disabling our computationally intensive
traffic-reduction strategies. Ultimately, it may be possible to
monitor the client-server connection and switch modes 
automatically based on the speed and cost of the 
connection and the user’s current level of urgency 
(measured as the amount of extra money the user is 
willing to pay to speed up transmission). 
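
The trade-off described above can be illustrated with a small cost model. This is only a sketch under stated assumptions: the class and method names are hypothetical, and the payload sizes, computation cost, and link speeds are made-up sample values rather than Gold Rush measurements.

```java
/** Hypothetical cost model for deciding whether traffic reduction pays off. */
final class LinkCostModel {
    /** Seconds needed to send a payload over a link of the given bandwidth. */
    static double transferSeconds(long bytes, double bytesPerSecond) {
        return bytes / bytesPerSecond;
    }

    /**
     * Returns true when spending cpuSeconds to shrink the payload from
     * rawBytes to reducedBytes gives a shorter total time than sending
     * the raw payload unchanged.
     */
    static boolean reductionWorthwhile(long rawBytes, long reducedBytes,
                                       double cpuSeconds, double bytesPerSecond) {
        double plain = transferSeconds(rawBytes, bytesPerSecond);
        double reduced = cpuSeconds + transferSeconds(reducedBytes, bytesPerSecond);
        return reduced < plain;
    }
}

class LinkCostDemo {
    public static void main(String[] args) {
        long raw = 1_000_000;      // assumed raw payload size in bytes
        long reduced = 250_000;    // assumed payload size after reduction
        double cpu = 2.0;          // assumed computation cost in seconds

        // Slow link (about 3,600 bytes/s): reduction wins. Prints true.
        System.out.println(LinkCostModel.reductionWorthwhile(raw, reduced, cpu, 3_600.0));
        // Fast LAN (about 1,250,000 bytes/s): reduction loses. Prints false.
        System.out.println(LinkCostModel.reductionWorthwhile(raw, reduced, cpu, 1_250_000.0));
    }
}
```

A mode switch of the kind envisioned above could evaluate such a predicate whenever the measured link speed changes; how the user's urgency and the monetary cost of the link would enter the formula is left open here.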

In our present architecture, all application-specific 
algorithms (except for the formulaic methods that our 
tools generate to translate between object-oriented and 
relational data bases) reside on the client. These 
algorithms communicate with the server through 
simple-minded requests to fetch an object with a given 
ID, to fetch a collection of objects satisfying a given 
query, or to commit a transaction remotely. Some 
algorithms (executable only in connected mode) may 
involve an extended dialogue between the client and 
server, in which the client requests some data and, 
based on the contents of that data, issues further 
requests. In such cases, traffic between the client and 
server could be substantially reduced by allowing the 
client to issue high-level application-defined requests to 
the server. These requests would invoke application- 
specific algorithms at the server and deliver results to 
the client. A server-based algorithm might entail a long 
series of database queries and updates, but these would 
all be performed locally on the server. Only the initial 
high-level request and the final result would have to be 
communicated, producing substantial savings when the 
connection is over a slow or expensive link. The 
server-based algorithm could even be executed autono- 
mously by an agent acting on behalf of a disconnected 
client. The client would retrieve the result of the 
autonomous computation upon reconnection. 


Conclusion 


Mobile applications in Java can easily be ported to 
other platforms, and can exploit Java’s strong support 
for distributed applications. Gold Rush allows mobile 
client code written in Java to access data stored in enter- 
prise relational data bases. These applications deal with 
Java objects corresponding to rows of relational 
database tables, belonging to classes that correspond to 
tables. Attributes of these objects reflect the 1-1, 1-n, 
and m-n relationships among relational-database
entities. We have developed tools that automatically 
generate the required Java classes and translate between 
the object and relational views of the data. 

Unlike other systems allowing relational-database 
entities to be manipulated as Java objects, Gold Rush 
allows users to cache objects off-line on the mobile 
client and then disconnect, obviating the need for a 
continual, potentially expensive, link to a central server. 
A client also has the option of running in disconnected 
mode when a slow link is available, to avoid
communication delays. Unlike systems that replicate a
subset of a relational database on the client, our archi- 
tecture confines all manipulation of relational data bases 
to central servers. The client deals purely with Java 
objects and can remain lightweight, using the same 
object interface for both connected and disconnected 
transactions. 

To guarantee atomicity of updates and the integrity 
of both the central database and the data stores on 
individual clients, we group updates into transactions. 
While the client is connected, transactions can be run
directly on the server. While the client is disconnected, 
transactions are constructed and saved locally on the 
client and replayed to the server upon reconnection. 
Objects participating in transactions can be locked 
optimistically, which allows other clients, or back-office 
applications, to use the same data, but makes it neces- 
sary to check for conflicts when client transactions are 
replayed to the server. In case of conflict, the central 
database is not updated and the client is notified of the 
failure. A conflict-resolution mechanism currently 
under design will allow the client to take appropriate 
actions to recover from the rejection of a transaction 
that had been tentatively committed. 
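
The replay-time conflict check can be sketched as follows, assuming each object carries a server-assigned timestamp as described under Future Directions. ToyServer, LoggedUpdate, and every method name here are hypothetical stand-ins, not Gold Rush classes; the sketch only illustrates the all-or-nothing behavior described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** One logged update: the object touched, the timestamp the client saw, the new value. */
final class LoggedUpdate {
    final String objectId;
    final long timestampRead;
    final String newValue;

    LoggedUpdate(String objectId, long timestampRead, String newValue) {
        this.objectId = objectId;
        this.timestampRead = timestampRead;
        this.newValue = newValue;
    }
}

/** Toy stand-in for the central server (assumption: one timestamp per object). */
final class ToyServer {
    private final Map<String, Long> timestamps = new HashMap<>();
    private final Map<String, String> values = new HashMap<>();
    private long clock = 0;

    /** Writes a value and stamps the object with a fresh timestamp. */
    void put(String objectId, String value) {
        clock++;
        timestamps.put(objectId, clock);
        values.put(objectId, value);
    }

    long timestampOf(String objectId) { return timestamps.get(objectId); }
    String valueOf(String objectId) { return values.get(objectId); }

    /** Replays a transaction: every update is applied, or none is. */
    boolean replay(List<LoggedUpdate> transaction) {
        for (LoggedUpdate u : transaction) {
            Long current = timestamps.get(u.objectId);
            if (current == null || current.longValue() != u.timestampRead) {
                return false;   // conflict: the object changed since the client cached it
            }
        }
        for (LoggedUpdate u : transaction) {
            put(u.objectId, u.newValue);
        }
        return true;
    }
}

class ReplayDemo {
    public static void main(String[] args) {
        ToyServer server = new ToyServer();
        server.put("cust-1", "old address");
        long seen = server.timestampOf("cust-1");        // cached before disconnecting

        List<LoggedUpdate> tx = new ArrayList<>();
        tx.add(new LoggedUpdate("cust-1", seen, "new address"));

        server.put("cust-1", "back-office change");       // concurrent update while offline
        System.out.println(server.replay(tx));            // prints false: replay is rejected
    }
}
```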

In addition to reducing the need for communication 
between client and server to the initial caching of 
objects and the replaying of transactions, we streamline 
those communications that are necessary. By reducing 
the amount of data that must be transmitted, we make 
the use of mobile clients more economical and practical.


Abstract


This paper introduces a thin-client programming
model and then presents an
object-oriented framework for developing applications using
the model. The programming model and the frame- 
work have evolved from interactions with developers 
and users of commercial applications. The key as- 
pects of the thin-client programming model are that 
the client downloads application front ends from the 
network; that these applications rely only on ser- 
vices found on network servers; that the services 
are bound as late as possible; and that the appli- 
cations interact with each other within the confines 
of a workspace. We implemented the framework us- 
ing Java Beans and JDK 1.1, and developed several 
sample applications using the framework. 


1 Introduction 


Fueled by Java™ and other Internet technologies,
new re-engineering efforts are underway to develop 
commercial applications using a thin-client programming
model. In a thin-client programming model,
the software client is substantially thinner,
in that it contains only the graphical user interface
(GUI) and a small amount of essential application 
logic. Most of the application logic runs as services 
on various servers throughout the network. The 
client software is written using Java so that it can 
run on any client hardware. The thin-client model 
is distinct from its hardware counterpart, known in 
the industry as the Network Computer. However, 
the thin-client programming model can be the force 
that makes Network Computers widely deployed. 


An application development paradigm becomes pop- 
ular if appropriate tools are available that enable 
developers to leverage its benefits easily. While
the Java programming language [1], Java Development
Kit (JDK) [2], Java component technology
[3], and remote access mechanisms [4, 5] enable 
platform-independent programming, they are only 
a set of building blocks. Previous work in object- 
oriented systems suggests that frameworks [6] can 
be a promising way of achieving widespread use and 
reuse of software architecture. Therefore, there is a 
need for a thin-client application framework that is 
capable of bringing together all parts of an applica- 
tion (the front-ends running on the client and the 
services available on network servers) and support- 
ing the whole with system services. Lacking such a 
framework, developers may find it difficult to boot- 
strap themselves into the new paradigm, and they 
might resort to an older and less portable method- 
ology such as the Microsoft Windows environment. 


Metis, the thin-client application framework presented
in this paper, is a related, inter-operable set of
objects that enable robust application development 
in the thin-client paradigm. The goal of Metis is to 
create a fully server-managed environment for an ap- 
plication, as opposed to the traditional client-server 
approach. Towards this end, the framework advo- 
cates and supports a thin-client programming model 
where an application consists of application front 
ends (AFEs) and a collection of backend application- 
specific services. AFEs rely solely on application- 
specific services and system services provided by one 
or more network servers. Thus, AFEs do not depend 
on local operating system functions. AFEs request 
services in an abstract manner without specifying 
the physical location of a service provider. That is, 
a requested service can be any one of the appropriate
service instances available in the network. AFEs
bind, on demand, to these network services. The
late binding of services allows server manageability, 
flexibility, and fault-tolerance. 


Metis provides Java classes on the client side for 
locating and binding to a service instance and 
for switching to an alternate service instance in 
case of a failure. In addition, Metis provides a 
workspace-based client environment suggested by a 
common commercial application characteristic: in- 
teracting sub-applications. The Metis workspace 
hosts and manages a set of sub-applications; each 
sub-application is in the form of an AFE. The work- 
space Manager provides visual tools to customize the 
workspace by adding or deleting AFEs. Workspace 
configuration information is stored on a server. 


In the current implementation, the Metis workspace 
provides the following object instances for use by the 
AFEs, and the list may grow as additional objects 
of common applicability are identified: 


• Service location and binding object;

• User authentication object;

• Controller objects for accessing and managing
system services such as printing and data storage.


On the server side, the Metis framework depends on 
support services including an authorization service 
that ensures controlled access to the system, a code 
service that maintains a secure repository of trusted 
AFEs, and a directory service that presents a search- 
able access to services. These support services must 
be fault-tolerant and scalable besides using industry 
standard protocols. Therefore, Metis uses a direc- 
tory service supporting the Light-weight Directory 
Access Protocol (LDAP) [7]. Such directory services 
are likely to become commonplace and even more
robust in the future. 


In addition to the above mentioned services, Metis 
requires printing and data-storage services, and a 
mechanism for launching and managing application- 
specific services on various servers. The latter can be 
accomplished, for example, using the servlets mech- 
anism [2]. 


The rest of the paper is organized as follows. Sec- 
tion 2 presents the Metis thin-client programming 
model; Sections 3 and 4 describe the Metis framework
and implementation, respectively. Section 5
discusses the related work, and Section 6 concludes 
the paper.


2 Thin-Client Programming Model


The key aspects of the Metis thin-client program- 
ming model are that the client downloads AFEs 
from the network; that these AFEs rely only on ser- 
vices found on network servers; that the services are 
bound as late as possible; and that AFEs interact 
with each other within the confines of a workspace. 
AFEs are securely installed and downloaded using 
a code service. They are also made ‘thin’ by imple- 
menting most of the application logic as one or more 
services. Late binding to these services provides: 


• manageability, because services can be moved
across server machines without impacting
AFEs;

• flexibility, because services can be selected
based on server load; and

• fault-tolerance, because a service can be obtained
from an alternate server.


The workspace is a container for AFEs, allowing for 
interaction, as well as providing a shared environ- 
ment. One important part of that environment is 
the authorization information that can be read from 
a smart card or provided as part of a logon process 
from an authentication service. The authorization 
information is used first to determine if a user is al- 
lowed to use the system, and then to identify the 
user’s access rights to available AFEs. Afterwards, 
this information can be used directly by the AFEs 
to authenticate themselves to the service providers. 


Figure 1 outlines the various building blocks of the 
thin-client programming model. It shows three im- 
portant parts — the client workspace, application- 
specific services, and support services needed to pro- 
vide full thin-client functionality. 


2.1 Client Workspace 


The client workspace provides a combination of functions
in Metis. It provides the AFE container function,
some of the conventional desktop functions,
and a virtual environment of network services. These
will be discussed in detail in this section.


Visually, the client workspace has a customizable 
layout that can be configured on a per-user basis 
using configuration information stored on a server. 
When a user logs on, all framework objects are in- 
stantiated. 


User Profile: The user profile includes an authorization
object that contains user information including
name and time of logon. The authorization
object is passed with directory and code service requests.
These support services recognize the object
and restrict the user to only those AFEs and services
that allow access by the user. Note that as long as
services and AFEs are written to use the authorization
object, a single logon procedure is possible.


Figure 1: The schematic shows the three important parts of the thin-client application model: the client
workspace, support services, and application-specific services. The client workspace contains user profile
and authentication objects, objects to find and bind to services, and controllers for data and print. AFEs
execute in the context of the client workspace.


Application Front Ends: The AFEs are down- 
loaded to the client from a code server either when 
the workspace is initialized (if they were previ- 
ously active) or when the user activates one on the 
workspace. They can be removed from the work- 
space as needed. Interactions among AFEs, such 
as data exchange and event notifications, are im- 
portant especially when the AFEs are implementing 
sub-applications. Interactions are supported using 
JavaBeans™ technology.


Virtual Environment Manager: A virtual en- 
vironment manager (VEM) is a fundamental client 
object provided by Metis. It is the only entity through
which an AFE can request Metis services. Prior to
accepting an AFE’s request, the VEM checks the 
AFE’s signature to ensure that it is allowed access 
to the Metis system. As long as the AFE is rec- 
ognized, the request is forwarded to one of the fol- 
lowing VEM clients that act as delegates to Metis 
support services. 


1. Directory Client: The VEM has an inter- 
nal directory client that communicates with the 
Metis Directory Service upon request of a service
from an AFE or the workspace. The directory
client and Metis Directory Service together
provide late binding of services. At this time,
services can be requested by name and service 
attributes though clearly, a higher level proto- 
col can be supported. The directory client re- 
trieves service stubs. If the service stub is an 
object, the directory client instantiates the ob- 
ject and passes a reference back to the caller 
AFE. If the service stub is a location, the di- 
rectory client passes the location back to the 
caller AFE. To reduce the possibility of creat- 
ing a flurry of messages, called a network storm, 
that can be caused by a failure of a widely- 
held service, the directory client retrieves and 
caches more than one service stub, when multi- 
ple providers of the same service are available. 
Should an active service fail, an alternate stub is
fetched, instantiated if necessary, and returned
to the AFE.


2. Service Stub Loader: The VEM has an inter- 
nal service stub loader that communicates with 
the service providers to download any stub code 
that the service might need for providing the 
service. The service stub loader is only used if 
the service stub returned by the directory needs 
to be instantiated. 
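
The caching of several stubs per service, used by the directory client to avoid a network storm, can be sketched as below. This is an illustration only: StubCache and its methods are hypothetical names, and provider locations are modeled as plain strings rather than real service stubs.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical client-side cache of alternate providers for each service name. */
final class StubCache {
    private final Map<String, Deque<String>> alternates = new HashMap<>();

    /** Stores every provider returned by a single directory lookup. */
    void store(String serviceName, List<String> providerLocations) {
        alternates.put(serviceName, new ArrayDeque<String>(providerLocations));
    }

    /** Returns the next untried provider, or null when none is left in the cache. */
    String next(String serviceName) {
        Deque<String> queue = alternates.get(serviceName);
        return (queue == null) ? null : queue.pollFirst();
    }
}

class StubCacheDemo {
    public static void main(String[] args) {
        StubCache cache = new StubCache();
        cache.store("print", Arrays.asList("hostA", "hostB"));   // one lookup, two providers

        System.out.println(cache.next("print"));   // first binding: hostA
        // ... hostA fails; fail over without contacting the directory again
        System.out.println(cache.next("print"));   // alternate: hostB
    }
}
```

Because the alternate is served from the local cache, the failure of a widely used service does not cause every affected client to query the directory at the same moment.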


AFE Loader/Launcher: The Workspace has an 
internal object that communicates with the Metis 
Code Server to download code and launch AFEs. 
The AFE Loader also checks for digital signatures, 
uncompresses, and decrypts AFEs as needed. En- 
abling a client to download code from a centrally 
administered source makes its use attractive for in- 
tranet environments where the software distribution 
and maintenance on traditional PC clients is expen- 
sive. The AFE Launcher runs the AFE when it is 
selected by the user. 


System Services: AFEs need a common set of 
system services such as printing, storage, and er- 
ror logging. While application services are private 
to each AFE, Metis allows sharing of system ser- 
vices. When an AFE requests a system service, an 
associated controller object is returned. If a stub to 
the requested service does not exist, the controller 
creates one by accessing the Metis Directory Service
to determine where the stub class is. The class
is downloaded and instantiated. The controller, in
turn, manages the stub instances. For example, in 
Metis a print controller is a single point of access 
used by all to 1) create access to specific printers 
and 2) send information to them. Most of the con- 
trollers are provided to support the AFEs. However, 
the data controller is also used by the Metis workspace
to access user configuration.


2.2 Application-Specific Services 


The Metis design imposes minimal requirements on 
service developers. They are free to implement ser- 
vices in any language. Communication between the 
service and the client-resident service stub can use 
any protocol, e.g., IIOP [5], RMI [4], or a private 
protocol. The service providers may provide a client- 
side stub with a well-known interface for access to its 
services, or, may only provide a location for AFEs 
to access services using a mutually understood pro- 
tocol. The AFEs and service providers may use the 
authorization object provided by Metis for authen- 
tication purposes. 


To better integrate services into the network, Metis 
provides a tool that can be used to register the ser- 
vices with the Metis Directory Service. For AFEs to 
dynamically access services, they must be registered 
with the directory. Registering a service makes it 
immediately available. Removing a service from the 
Metis Directory Service does not impact AFEs cur- 
rently using the service. However, when an AFE de- 
tects that the service is no longer available, i.e., the 
service was removed from the server, it can fail-over 
to another service registered in the Metis Directory 
Service. 


2.3 Metis Support Services 


Metis has a number of server components performing
distinct functions. While these services are not
fundamentally part of the Metis thin-client program- 
ming model, they are needed to support that model. 
For example, the model states that AFEs can bind 
to any service available on the network that meets 
their requirements. To have the capability to find all
such services, a directory service is used. 


Metis Directory Service: The Metis Directory 
Service accepts queries from the directory client and 
sends results back to the client. It does not manage 
the physical service directory. Instead, the Metis 
Directory Service acts as a client to an LDAP [7] 
directory server. LDAP is an emerging standard 
in distributed directory services offering reliabil- 
ity and scalability. Each service in the directory 
has a unique service location and a number of at- 
tribute/value pairs. Each service must at least have 
a name attribute with a non-empty value. Services 
can be looked up by a name and a search filter 
composed of a boolean expression of attribute/value 
pairs. The Metis Directory Service can also perform 
access control via LDAP with the authorization object
that the directory client supplies. The Metis
Directory Service allows service providers and code- 
server administrators to add, modify, or delete ser- 
vices in the directory. 
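
A lookup by name plus attribute/value pairs maps naturally onto the standard string form of an LDAP search filter. The sketch below builds such a filter; ServiceFilter is a hypothetical helper, and the attribute names name and color are assumptions chosen for illustration, not the schema Metis uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that renders a service query as an LDAP filter string. */
final class ServiceFilter {
    /** Escapes the characters that are special inside an LDAP filter value. */
    static String escape(String value) {
        StringBuilder out = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '\\': out.append("\\5c"); break;
                case '*':  out.append("\\2a"); break;
                case '(':  out.append("\\28"); break;
                case ')':  out.append("\\29"); break;
                case '\0': out.append("\\00"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    /** Builds (&(name=...)(attr=value)...), which requires every pair to match. */
    static String allOf(String serviceName, Map<String, String> attributes) {
        StringBuilder out = new StringBuilder("(&(name=");
        out.append(escape(serviceName)).append(')');
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            out.append('(').append(e.getKey()).append('=')
               .append(escape(e.getValue())).append(')');
        }
        return out.append(')').toString();
    }
}

class ServiceFilterDemo {
    public static void main(String[] args) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("color", "true");
        // prints (&(name=ColPrt2)(color=true))
        System.out.println(ServiceFilter.allOf("ColPrt2", attrs));
    }
}
```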


The design of the Metis Directory Service simpli- 
fies ports to other directory technologies supporting 
both current and emerging network directory stan- 
dards. It also can be easily enhanced to provide 
intelligence to the service selection process. As men- 
tioned earlier, AFEs must currently know the given 
name of a service and important attributes as well as 
their correct values. For example, if an AFE wanted 
to access a color printer service, in the present design 
it would have to ask for one by name, e.g., ColPrt2. 
In the future, it might want to ask for a printing 
service that is physically close, e.g., nearby & color. 


Metis Code Server: The Metis Code Service is an 
AFE repository. It provides central administration 
of client code and lends itself to be a tuner for a 
third-party code service that uses Marimba [8]. The 
Metis Code Server can perform access control using 
the authorization object. It can also digitally sign 
AFEs to identify them as part of the Metis system. 
Like the Metis Directory Service, the Metis Code 
Service is independent of the actual code distribu- 
tion mechanism. 


Metis Authorization Service: The Metis au- 
thorization process can either be completely self- 
contained, e.g., all authorization information is 
present on the user’s smart card, or it can use a 
standard authorization service such as Kerberos [9]. 
A reference implementation for the Metis Authoriza- 
tion Service is being developed. 


System Servers: Controllers provided by Metis act 
as the contact points for actual system services avail- 
able on the network. In particular, a data-store ser- 
vice and printing service must be available on one 
or (preferably) more servers, and must be registered 
with the Metis Directory Service. These services re- 
ceive requests from their associated controllers and 
return responses. 


3 Metis Framework 


The thin-client programming model described in the 
previous section recommends a general software ar- 
chitecture that allows applications to be written 
flexibly and to run within a manageable and fault- 
tolerant environment. However, to build these ap- 
plications from scratch would be very difficult. The 
Metis framework, shown in Figure 2, presents Java
classes and interfaces needed to make programming
these applications easier.


The Metis framework provides instances of the 
key objects discussed in the programming model 
including the workspace, user profile, AFE 
loader/launcher, VEM, directory client, service-stub 
loader, and controller objects. It also manages AFE 
objects. In addition, interfaces are provided to as- 
sist the application developer with the task of writ- 
ing a thin-client application. These interfaces can 
be grouped as follows: 


• AFE integration into the workspace

• AFE access to services

• Metis System Services interfaces

• Metis Support Services interfaces

The following subsections discuss the interfaces 
shown in Figure 2. 


3.1 AFE Integration into the Workspace 


One purpose of the workspace is to contain and 
launch AFEs. A Metis user indicates which AFEs 
the workspace contains by using a “ToolBox” that 
provides the user with a list of all AFEs that he is 
entitled to use. When the user selects an AFE to add 
to the workspace, the ToolBox returns the location
of the AFE class. The workspace passes the location 
to the code service client to download the class from 
the Metis Code Server. The icon associated with 
the class is accessed and added to the workspace 
as a button. To launch the AFE, the user double 
clicks the button. An AFE can be removed from the 
workspace container by removing the button. If it is 
active, then the AFE is destroyed. The workspace 
configuration is automatically saved when the user 
logs out. 


Another purpose of the workspace is to provide re- 
sources to all AFEs. These resources include the 
user profile object and the controllers. To access 
these resources, the AFE is required to get a ref- 
erence to the workspace object by implementing 
AFEInterface. When activating an AFE, the work- 
space calls the setWorkspace() method on the AFE, 
passing a reference to the workspace object. 
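
A minimal sketch of this handshake follows. AFEInterface and setWorkspace() are named in the text, but the signature shown is an assumption, and the Workspace type with its userName() accessor is a hypothetical placeholder for the real workspace object.

```java
/** Placeholder for the Metis workspace; its real methods are not given in the paper. */
interface Workspace {
    String userName();   // hypothetical accessor standing in for the user profile
}

/** Named in the paper; the exact signature of setWorkspace() is assumed here. */
interface AFEInterface {
    void setWorkspace(Workspace workspace);
}

/** A minimal AFE that keeps the workspace reference handed to it on activation. */
final class GreetingAfe implements AFEInterface {
    private Workspace workspace;

    @Override
    public void setWorkspace(Workspace workspace) {
        this.workspace = workspace;
    }

    /** Uses a workspace resource; fails clearly if the AFE was never activated. */
    String greeting() {
        if (workspace == null) {
            throw new IllegalStateException("AFE has not been activated");
        }
        return "Hello, " + workspace.userName();
    }
}
```

On activation, the workspace would call afe.setWorkspace(this) before running the AFE, so that all later resource requests go through the stored reference.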


3.2 AFE Access to Services 


To access services, AFEs use the interface called 
the VEMInterface implemented by the workspace 
object. The workspace object delegates the VEMInterface
calls to its member VEM object, which actually
implements the VEMInterface.


Figure 2: The figure shows important classes in Metis and their relationships. AFEs acquire a handle
to the Workspace by implementing the AFEInterface. Workspace implements the VEMInterface and the
ControllerInterface, and thus provides the necessary support for AFEs. Workspace uses delegation (to the
VEM object) in implementing the VEMInterface. The workspace object contains VEM, controller, and
UserProfile objects, and manages AFE instances.


To bind to a service, the AFE calls the requestService()
method on its workspace reference. The
requestService() method takes the AFE instance,
the name of the service, and a preferred list of
attribute/value pairs as arguments. A Metis object,
called a ServiceInfo, is returned that contains an
instance of the service stub or, if desired, the service
location. All information needed to access the service
is now available to the AFE. Should a service
fail while being used, the AFE can re-request a new
service by calling the recoverService() method on the
workspace, passing a reference to itself and the active
ServiceInfo object.
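
The bind-then-recover pattern can be sketched from the AFE's side as follows. The method names requestService() and recoverService() and the ServiceInfo type come from the text; their signatures, the fields of ServiceInfo, and the ServiceCaller helper are assumptions made for this illustration.

```java
import java.util.Map;
import java.util.function.Function;

/** Named in the paper; the fields are assumed. Holds a stub instance or a location. */
final class ServiceInfo {
    final Object stub;       // instantiated service stub, or null
    final String location;   // service location, or null

    ServiceInfo(Object stub, String location) {
        this.stub = stub;
        this.location = location;
    }
}

/** Assumed shape of the two service-access calls described above. */
interface VEMInterface {
    ServiceInfo requestService(Object afe, String name, Map<String, String> attributes);
    ServiceInfo recoverService(Object afe, ServiceInfo failed);
}

/** Hypothetical helper: one attempt on the bound service, one on a replacement. */
final class ServiceCaller {
    static <R> R callWithRecovery(VEMInterface vem, Object afe, String name,
                                  Map<String, String> attributes,
                                  Function<ServiceInfo, R> call) {
        ServiceInfo info = vem.requestService(afe, name, attributes);
        try {
            return call.apply(info);
        } catch (RuntimeException failure) {
            // The active service failed: ask for an alternate and retry once.
            ServiceInfo alternate = vem.recoverService(afe, info);
            return call.apply(alternate);
        }
    }
}
```

Retrying exactly once is a simplification; a real AFE would decide how many alternates to try and how to report a total failure to the user.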


3.3 Metis System Services Interfaces


Metis defines a generic Controller class for manag- 
ing instances of objects that provide shared system 
services to all AFEs on the workspace. The Con- 
troller class provides a graphical user interface for 
adding and removing system-service instances from 
the workspace and for allowing the user to select 
a specific instance to provide the service. Service- 
specific controllers extend the Controller class and 
implement service interfaces to provide the desired 
service. The services are explicitly configured using 
a GUI provided by the Controller. A user can also 
set up a specific instance of the service provider as a
default for that service. 


To access the system services, an AFE calls the get- 
Controller() method. For example, should the AFE 
need access to the data store, it calls the getCon- 
troller() method, passing it the type of system ser- 
vice required, e.g., ‘DATASTORE’. The VEM re- 
turns the instance of the appropriate controller for 
that service. The AFE then uses the instance when 
it needs system services. The Controller, in turn, delegates
the request to the default service provider, or
allows the user to select a service provider from the 
list of configured providers. 
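
The delegation from a service-specific controller to its default provider can be sketched as below. The generic Controller base class and the idea of service-specific subclasses follow the text; the type parameter, the method names, and the Printer interface are assumptions, and the graphical configuration interface is omitted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical generic controller: manages the providers of one system service. */
class Controller<S> {
    private final Map<String, S> providers = new LinkedHashMap<>();
    private String defaultName;

    /** Registers a provider; the first one registered becomes the default. */
    void addProvider(String name, S provider) {
        providers.put(name, provider);
        if (defaultName == null) {
            defaultName = name;
        }
    }

    /** Selects which configured provider receives delegated requests. */
    void setDefault(String name) {
        if (!providers.containsKey(name)) {
            throw new IllegalArgumentException("unknown provider: " + name);
        }
        defaultName = name;
    }

    /** Returns the provider that requests are delegated to. */
    S defaultProvider() {
        if (defaultName == null) {
            throw new IllegalStateException("no provider configured");
        }
        return providers.get(defaultName);
    }
}

/** Assumed service interface for printing. */
interface Printer {
    String print(String document);
}

/** Service-specific controller: extends Controller and implements the service interface. */
final class PrintController extends Controller<Printer> implements Printer {
    @Override
    public String print(String document) {
        return defaultProvider().print(document);   // delegate to the default provider
    }
}
```

An AFE holding a PrintController can call print() without knowing which printer is configured; changing the default with setDefault() redirects later requests.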


Metis also provides reference APIs for system ser- 
vices such as printing and data storage. The printer 
API allows an AFE to print, check the status of, or remove
a print job. The data storage API allows an
AFE to store and retrieve files from a data store. 


3.4 Metis Support Services Interfaces 


The Metis Support Services interfaces are not visi- 
ble to the application programmer. They are hidden 
within the VEM, controllers, and the user profile. At 
this time, three key interfaces have been identified. 
The DirectoryInterface has been designed and imple- 
mented. The CodeServiceInterface and the MetisSe- 
curityInterface are being developed. 


The directory client uses the DirectoryInterface to 
avoid being tied to a single directory protocol. Any 
physical directory service can transparently plug 
into Metis simply by enhancing the Metis Directory 
Server with updated implementations of the Direc- 
toryInterface. The interface has been kept small, 
containing only one method, to minimize the com- 
plexity of directory access. When the VEM client 
is requested to access the Metis Directory Server 
(e.g., an AFE called the requestService() method), 
the VEM calls the lookUpService() method on the 
directory client. The authorization object, name, 
and attributes passed to the VEM are passed as ar- 
guments to the lookUpService() call. 


4 Metis Implementation 


This section gives the highlights of the implementa- 
tion for both the Metis framework and the AFE suite 
used to demonstrate it. The discussion of the Metis 
framework implementation will be restricted to the 
workspace implementation and the service access in- 
terfaces, i.e., the VEMInterface and the DirectoryIn- 
terface implementations, along with important util- 
ity objects that are used. The discussion of the AFE 
implementations will center on what specifically was 
done to integrate the applications into the thin-client 
programming model. Overall, the Metis framework 
is about 7000 lines of Java code. 


4.1 Metis Workspace 


The reference Metis implementation supports Java 
bean behavior, to allow a natural mechanism for 
AFEs to interact within the workspace. The Sun 
Microsystems reference implementation of the Bean- 
Box is the basis of the Metis workspace. The Bean- 
Box provides bean containment, and support for 
property sheets, visual manipulation of beans, and 
state storage.¹


At workspace startup, a logon process verifies the
user and creates the user profile object if the verification
is successful. This process includes showing a
dialog box for user name and password information
and checking it against an authorized user list. The
user is also allowed to use a “smart card”² instead of
providing information directly to the logon dialog. If
the user passes verification, the VEM and controller
objects are instantiated.


¹ Unfortunately, some of these functions (e.g., event interactions,
property sheets, and pickling) had to be disabled to
get the base functions working since the version of JDK 1.1
(Beta 3) was unstable. These functions will be reintegrated
as JDK 1.1 gets more mature.

² A floppy drive was used as a placeholder for smart card
hardware.


In Metis, the workspace is populated with buttons, 
each showing the icon of the AFE. Each button con- 
tains the URL of an applet and an appletviewer 
instance with a reference to the applet. The ap- 
pletviewer has been modified from the JDK imple- 
mentation to separate the applet instantiation from 
its execution. When a button is selected, the ap- 
pletviewer initiates the applet execution. The initial 
layout on the workspace is read from the user’s con- 
figuration file, accessed from a data store. 


The ToolBox has been completely rewritten from the 
BeanBox implementation. In our implementation, 
the ToolBox queries the directory server to get a list 
of AFEs that the user can access. It allows a user to 
add an AFE as a button to the workspace, and when 
the button is added the AFE loader downloads the 
AFE code from the Metis Code Server. 


4.2 VEMInterface Implementation 


An AFE requests a service by calling the requestSer- 
vice() method on its workspace reference. The AFE 
passes a reference to itself, the well-known name of 
the service, and a special Filter object. Passing the 
requester of the service enables the VEM to keep 
state on each AFE. It also allows application de- 
signers to, in effect, build a service providing ob- 
ject that gets services on behalf of all AFEs in the 
suite. The Filter object was designed to simplify 
the attribute/value logic specification. At this time, 
only the “AND” function is provided, e.g., a service 
must have “attr1=val1” & “attr2=val2”. The Filter
class will be enhanced in the future to allow arbi- 
trary logic specifications. An example code segment 
showing service request is: 


Filter filter = new Filter();
filter.addElement("height", "low");
filter.addElement("speed", "slow");
workspace.requestService(this,
    "my_service", filter);
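
The Filter class itself is not shown in the text. As a rough illustration of the AND-only semantics described above, a filter could be sketched as follows. Only addElement() comes from the example; the matches() method and the map-based service description are assumptions made for this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of an AND-only attribute/value filter.
// Only addElement() appears in the paper; matches() and the Map-based
// service description are assumptions for illustration.
class FilterSketch {
    private final Map<String, String> required = new LinkedHashMap<>();

    // Record one attribute/value pair that a service must have.
    void addElement(String attribute, String value) {
        required.put(attribute, value);
    }

    // A service matches only if every required pair is present (logical AND).
    boolean matches(Map<String, String> serviceAttributes) {
        for (Map.Entry<String, String> e : required.entrySet()) {
            if (!e.getValue().equals(serviceAttributes.get(e.getKey()))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        FilterSketch filter = new FilterSketch();
        filter.addElement("height", "low");
        filter.addElement("speed", "slow");
        System.out.println(filter.matches(Map.of("height", "low", "speed", "slow")));
        System.out.println(filter.matches(Map.of("height", "low", "speed", "fast")));
    }
}
```

Extra attributes on the service do not prevent a match in this sketch; only missing or different required pairs do.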


The requestService() implementation in the VEM 
class does the following list of actions. Note that 
not all exception paths are included in the list. 


1. If a null filter was passed in, the VEM 
creates one. A special attribute/value pair 
(‘type’/‘Service’) is added. The type attribute 
was added because the Metis Directory Service 
contains both services and AFEs. 


2. The VEM checks its internal state to see if the
AFE is a known service user. 


3. If the AFE is not a known service user, its au- 
thorization is checked to ensure that the AFE 
has the credentials needed to use the Metis sys- 
tem. If it passes the check, it is added to the 
VEM internal state. Otherwise, the requestSer- 
vice() throws an exception. 


4. The VEM checks its internal state to see if the 
AFE has previously asked for the same service. 
If so, it returns the associated ServiceInfo ob-
ject.


5. The lookUpService() method is called on the 
directory client and the returned list is wrapped 
in a ServiceInfo object.


6. The VEM checks the validity of the ServiceInfo 
object. The object will be invalid if the direc- 
tory search could not find any services, e.g., if 
the service was unknown or if the user did not 
have access to those registered. An excep-
tion is thrown if the ServiceInfo is invalid. A
timestamp is also added to the new ServiceInfo
object. The ServiceInfo object is then added 
to the VEM internal state and returned to the 
AFE. 
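
The steps above can be sketched in Java as follows. This is a simplified illustration, not the Metis VEM: the Directory interface, the string-keyed cache, the placeholder authorization check, and the exception types are all assumptions, and service locations are plain strings rather than URLs.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the six requestService() steps.
// All types are simplified stand-ins, not the Metis classes.
class VemSketch {
    // Stand-in for the directory client; returns matching service locations.
    interface Directory {
        List<String> lookUp(String name, Map<String, String> filter);
    }

    static class ServiceInfo {
        final List<String> locations;
        long timestamp;
        ServiceInfo(List<String> locations) { this.locations = locations; }
        boolean isValid() { return !locations.isEmpty(); }
    }

    private final Directory directory;
    // Internal state: known AFEs and the ServiceInfo objects each has been given.
    private final Map<Object, Map<String, ServiceInfo>> state = new HashMap<>();

    VemSketch(Directory directory) { this.directory = directory; }

    // Placeholder check; the real authorization criteria are not given in the text.
    boolean isAuthorized(Object afe) { return afe != null; }

    ServiceInfo requestService(Object afe, String name, Map<String, String> filter) {
        // Step 1: create a filter if none was passed and add the type pair.
        Map<String, String> f = new TreeMap<>();
        if (filter != null) f.putAll(filter);
        f.put("type", "Service");

        // Steps 2 and 3: unknown AFEs are admitted only after authorization.
        Map<String, ServiceInfo> perAfe = state.get(afe);
        if (perAfe == null) {
            if (!isAuthorized(afe)) throw new SecurityException("AFE not authorized");
            perAfe = new HashMap<>();
            state.put(afe, perAfe);
        }

        // Step 4: return the earlier result if the same service was requested before.
        String key = name + " " + f;   // TreeMap gives a canonical filter string
        ServiceInfo cached = perAfe.get(key);
        if (cached != null) return cached;

        // Step 5: directory lookup, wrapped in a ServiceInfo.
        ServiceInfo info = new ServiceInfo(directory.lookUp(name, f));

        // Step 6: validate, timestamp, remember, and return.
        if (!info.isValid()) throw new IllegalStateException("no service found: " + name);
        info.timestamp = System.currentTimeMillis();
        perAfe.put(key, info);
        return info;
    }

    public static void main(String[] args) {
        VemSketch vem = new VemSketch((name, filter) -> List.of("http://host-a:9000/"));
        Object afe = new Object();
        ServiceInfo info = vem.requestService(afe, "my_service", null);
        System.out.println(info.locations);
        System.out.println(info == vem.requestService(afe, "my_service", null));
    }
}
```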


The object class, ServiceInfo, returned by the re- 
questService() method maintains the information re- 
turned by the directory lookup. For the reference 
implementation, the locations of the services are 
stored within the ServiceInfo object as URLs. The 
class provides accessor methods for AFEs to use. 


The AFE uses the ServiceInfo object to get to the
service. During its use, the service could fail. The 
AFE, when it detects such an occurrence, can re- 
quest a replacement service by calling the recover- 
Service() method on its workspace, passing it a ref- 
erence to itself, and the ServiceInfo it used to get 
the previous service. The workspace delegates the 
request to its VEM object. The VEM class’s recov-
erService() implementation does the following list of
actions. Again, not all exception paths are included. 


1. Add the currently accessed service to the black-
listed services maintained in the ServiceInfo ob-
ject. This blacklist is simply a means of tracking
which of the services returned by the directory 
lookup have been used, and failed. 


2. See if there is another service known in the Ser-
viceInfo that has not yet been used. If so, set up
that service, make it the current service in the
ServiceInfo object, and return the same
ServiceInfo object.


3. Otherwise, save the timestamp and the current 
blacklist. The timestamp is important in later 
steps. The blacklist is needed because more ser- 
vices may be available than those returned dur- 
ing the previous directory lookup(s). That is, 
the blacklist transcends the partitioning done 
by the lookup; it is used to capture the AFE’s
access to ALL relevant services.


4. Request the directory client to do a new lookup,
passing it the blacklist, and wrap the returned
list in a new ServiceInfo object.


5. If the ServiceInfo object is valid, then there were 
other services as yet unused by the AFE that 
met its requirements. Copy back the timestamp 
and the blacklist. Return the ServiceInfo object
to the AFE. 


6. If the ServiceInfo is invalid, one of two scenarios
could have happened. First, the current sweep 
through all possible services may have finished. 
If so, a new sweep is started by clearing the 
blacklist and creating a new timestamp. Sec- 
ond, there may be no more services available, 
even though they may still be registered with 
the directory service. One way for this scenario 
to happen is if there is a network partition that 
does not affect access to the Metis Directory 
Service but that interferes with access to the 
servers that the services are running on. It is 
detected by noting that the time required to 
finish a sweep is less than a pre-configured time 
determined by the system administrator. 


7. If no new services are available, VEM repeats 
steps 4 through 6 one more time. If, even after 
the retry, it cannot satisfy the request then it 
displays an error message. 
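
One possible reading of the recovery steps above is sketched below. It is an interpretation, not the Metis code: the types are simplified stand-ins, the current time is passed in as a parameter so the logic can be exercised deterministically, a sweep that finishes faster than the configured threshold is treated as the "services unreachable" case, and an exception stands in for the error message.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the recoverService() steps; all names are assumptions.
class RecoverySketch {
    // Stand-in directory client: returns locations not on the blacklist.
    interface Directory {
        List<String> lookUp(String name, Set<String> blacklist);
    }

    static class ServiceInfo {
        final String name;
        final List<String> candidates;           // locations from one lookup
        String current;                           // location in use
        Set<String> blacklist = new HashSet<>();  // locations used and failed
        long sweepStart;                          // timestamp of the current sweep

        ServiceInfo(String name, List<String> candidates, long sweepStart) {
            this.name = name;
            this.candidates = candidates;
            this.current = candidates.isEmpty() ? null : candidates.get(0);
            this.sweepStart = sweepStart;
        }
        boolean isValid() { return current != null; }
    }

    private final Directory directory;
    private final long minSweepMillis;  // threshold set by the administrator

    RecoverySketch(Directory directory, long minSweepMillis) {
        this.directory = directory;
        this.minSweepMillis = minSweepMillis;
    }

    ServiceInfo recoverService(ServiceInfo info, long now) {
        // Step 1: blacklist the service that just failed.
        info.blacklist.add(info.current);
        // Step 2: try another location from the same lookup.
        for (String candidate : info.candidates) {
            if (!info.blacklist.contains(candidate)) {
                info.current = candidate;
                return info;
            }
        }
        // Step 3: keep the timestamp and blacklist across lookups.
        Set<String> blacklist = info.blacklist;
        long sweepStart = info.sweepStart;
        // Step 7: steps 4 through 6 run at most twice.
        for (int attempt = 0; attempt < 2; attempt++) {
            // Step 4: new lookup that excludes blacklisted locations.
            ServiceInfo fresh = new ServiceInfo(
                info.name, directory.lookUp(info.name, blacklist), sweepStart);
            if (fresh.isValid()) {
                fresh.blacklist = blacklist;      // Step 5: copy state back
                return fresh;
            }
            // Step 6: a sweep that ended too quickly suggests unreachable servers.
            if (now - sweepStart < minSweepMillis) break;
            blacklist = new HashSet<>();          // otherwise start a new sweep
            sweepStart = now;
        }
        throw new IllegalStateException("no usable service: " + info.name);
    }

    public static void main(String[] args) {
        List<String> registered = List.of("a", "b", "c");
        Directory dir = (name, excluded) -> {
            List<String> out = new ArrayList<>();
            for (String s : registered) {
                if (!excluded.contains(s) && out.size() < 2) out.add(s);
            }
            return out;
        };
        RecoverySketch vem = new RecoverySketch(dir, 1000);
        ServiceInfo info = new ServiceInfo("svc", dir.lookUp("svc", new HashSet<>()), 0);
        info = vem.recoverService(info, 10);
        System.out.println(info.current);
        info = vem.recoverService(info, 20);
        System.out.println(info.current);
    }
}
```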


4.3 ControllerInterface Implementation


Controllers that implement the ControllerInterface 
provide AFEs with access to system services. In the 
current version, two types of controller classes are 
available: PrintController and DataController. The 
PrintController provides access to printers while the 
DataController provides access to data stores. Note 
that there is only one instance of the Controller for 
each system service. The interface consists of the 
method below. 


public Object getController(int type) {
    switch (type) {
    case ControllerInterface.PRINTER:
        return printController;
    case ControllerInterface.DATASTORE:
        return dataController;
    default:
        return null;
    }
}


4.4 DirectoryInterface Implementation 


The DirectoryInterface need only define the follow- 
ing method: 


public SearchResult 
lookUpService(Authobj authobj, 
String name, String filter, 
URL[] blacklist, int howmany, 
String attrlist) 
throws java.rmi.RemoteException;


The requestService() method of the VEMInterface 
calls this method to access the LDAP directory. The 
parameters passed to this method are: 


authobj: The authentication object from the 
workspace used for access control in the LDAP 
directory. 


name: The name of the service being looked for. 


filter: A filter composed of attribute/value
pairs.


blacklist: A list of service locations that the
client specifically does not want returned even
if matched.


howmany: A count of matches to be returned. 


attrlist: A list of attributes that are to be re- 
turned with every match. 


The method returns an object containing service 
locations and attributes. The implementation of 
the lookUpService() method on the Metis Direc- 
tory Server is straightforward. The Metis Directory 
Server acts as an LDAP client. It constructs an LDAP
query and submits the query to an LDAP server. If
desired, a random subset is chosen from the services
returned by the LDAP server and is returned to
the client. At present, the Metis Directory
Server uses an LDAP client API. The use of the Java
Naming and Directory Interface (JNDI) [2] is being
investigated. 
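
The query construction and subsetting are not shown in the text. As a minimal sketch, attribute/value pairs could be turned into a standard LDAP search filter string of the form (&(a=b)(c=d)), and a random subset drawn from the matches. The class and method names are hypothetical, attribute names are assumed to be trusted identifiers, at least one pair is assumed, and no LDAP client library is involved.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

// Hypothetical helpers for a directory server acting as an LDAP client.
class DirectoryQuerySketch {
    // Escape characters that are special in LDAP search filter values.
    static String escape(String value) {
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\5c"); break;
                case '*':  sb.append("\\2a"); break;
                case '(':  sb.append("\\28"); break;
                case ')':  sb.append("\\29"); break;
                case '\0': sb.append("\\00"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Combine all pairs with AND; pairs are sorted for a stable result.
    static String toLdapFilter(Map<String, String> pairs) {
        StringBuilder sb = new StringBuilder("(&");
        for (Map.Entry<String, String> e : new TreeMap<>(pairs).entrySet()) {
            sb.append('(').append(e.getKey()).append('=')
              .append(escape(e.getValue())).append(')');
        }
        return sb.append(')').toString();
    }

    // Return at most howMany matches, chosen at random without repetition.
    static <T> List<T> randomSubset(List<T> matches, int howMany, Random rng) {
        List<T> copy = new ArrayList<>(matches);
        Collections.shuffle(copy, rng);
        int n = Math.max(0, Math.min(howMany, copy.size()));
        return new ArrayList<>(copy.subList(0, n));
    }

    public static void main(String[] args) {
        System.out.println(toLdapFilter(Map.of("type", "Service", "speed", "slow")));
        System.out.println(randomSubset(List.of("x", "y", "z"), 2, new Random()));
    }
}
```

Returning a random subset spreads clients across equivalent services, which fits the blacklist-based recovery described in Section 4.2.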


The interface is simple but flexible enough to be used 
for various purposes. In addition to looking for ser- 
vice names and services with specific attributes, it 
can also return a list of all services (subject to access 
control) when used as: 


SearchResult s=lookUpService(authobj, 
null, null, null, 
ALL_POSSIBLE, null) ; 


where the null values indicate no filtering and sub- 
setting is to be performed; or, can return the count of 
all services with the name “LotusNotes” when used 
as: 


int i=lookUpService(authobj,
    "LotusNotes", null, null,
    ALL_POSSIBLE, null).getCount();


4.5 AFE Implementations 


A number of applications were implemented to 
demonstrate the usefulness of the framework, includ- 
ing an application (called ViewGlass) that can ac- 
cess Lotus Notes servers and a financial application 
suite. In this section, the AFE/service partitions are 
described as well as AFE use of the VEMInterface 
to access its service for the ViewGlass application. 


4.5.1 ViewGlass Implementation Notes 


ViewGlass provides a user with the ability to access 
her Notes mailbox (to send, read, and delete mes- 
sages), and to read discussion databases. The func- 
tional split for this application is that the AFE, writ- 
ten in Java, would contain the GUI, rich text brows- 
ing support, and a private communication layer that 
directly accesses a proxy service. The service, writ- 
ten in ‘C’, would accept messages from the GUI, call 
the appropriate NotesAPI functions, and return in- 
formation back to the GUI. The service runs as a 
daemon and contains client-specific state. 


During the ViewGlass initialization process, the re- 
questService() method is called to get the location 
of a known Lotus Notes service. The location of the 
service is then accessed and a socket connection is 
made. This is shown in the code below. 


try {
    sinfo = workspace.requestService(
        (Object)this, "LotusNotes", null);
} catch (Exception e) {
    System.out.println(
        "LotusNotes service not found");
    destroy();
    return;
}
URL url = sinfo.getURL();
server_name = url.getHost();
server_port = url.getPort();


Connection failures are reported to the AFE via the
exception mechanism. The AFE calls the reset() 
method to recover from failure. 


public void reset() {
    try {
        sinfo = workspace.recoverService(
            (Object)this, sinfo);
    } catch (Exception e) {
        System.out.println(
            "LotusNotes service not recovered");
        proxy_recv = null;
        destroy();
        return;
    }
    URL url = sinfo.getURL();
    String server_name = url.getHost();
    int server_port = url.getPort();

    // ... make the connection
    wk_sp_frame.recover();
}


Note that the reset() method ends with a call to the 
recover() method. Since some client state is main- 
tained in the service, full recovery is not possible 
from just the client. However, the state completely 
contained in the client is recovered. 


5 Related Work 


During the last year, many players in the computer 
industry have focused attention on alternatives to 
the traditional client. One of the design points for 
most of the efforts is to ensure that the users con- 
tinue to have access to all resources that they are 
accustomed to. One well-investigated system is to 
provide access to applications using a browser. In 
this system, the application runs mostly on well- 
known servers. HTML pages, some enhanced with 
small Java applets, are sent back to the client. The 
browser relies on an underlying operating system 
to get access to files and printers. The browser 
metaphor works well if there are only a few applica- 
tions that do not interact much and if each applica- 
tion has a restricted amount of user interaction. 
The browser system is one possible choice for en-
abling the use of Network Computers as the hard-
ware client for thin-client applications. However, for 
environments where an application is composed of 
many sub-applications that may interchange data 
frequently, the browser metaphor is not sufficient. 
The browser metaphor also lacks the integration of 
network-based access to system resources. Other al- 
ternatives like defining Network Computer desktops 
or webtops (i.e., the Lotus DeskTop and HotJava 
Views) have been investigated as extensions to the 
browser metaphor. Also, many vendors are vying 
to provide environments for Network Computers to 
access system resources commonly found on tradi- 
tional desktops, like printing and file access [10]. 


During the Metis effort, we emphasized enabling 
commercial applications for Network Computers. 
Many of these business applications are currently 
single-system based. Moving to a network (1) in- 
troduces unreliability not often found with stand- 
alone systems, (2) raises security concerns, and (3) 
distributes resources like printers and files. While 
the browsers and Network Computer desktops could 
handle the latter two issues in future implementa- 
tions, they are not meant to address the first. Both 
the Lotus DeskTop and HotJava Views could be in- 
tegrated with Metis by replacing the workspace used 
in the reference implementation, to address all three 
issues today. Metis would then also provide a frame- 
work for developers to implement robust applica- 
tions for browsers and Network Computer desktops. 


The previous paragraphs focused on browsers and 
desktops that provide access to complex applications 
for Network Computers. However, Metis provides a 
distributed application technology as well as a user 
system. There are several distributed application 
technologies for traditional clients and servers that 
some developers could use for Network Computers. 
These technologies include CORBA and design patterns.


CORBA [11, 12, 13] is a distributed applica-
tion technology specified by OMG that emphasizes
reusable services and facilities. These specifications
allow applications to interact with other applica-
tion modules independent of the machine architec-
ture and language the modules have been written
in. OMG’s Trading service specification allows an
application to query and identify service names that
match particular criteria. These service names are
then bound to a particular object as per the Nam-
ing service specification. The combination of the two
services can be used by applications under CORBA
to achieve late binding of a name to an object. Metis
provides similar late binding of service providers to 
an AFE using an LDAP directory server. AFEs can 
specify service properties through the Filter class. 
Results of the match are available to the AFEs for 
binding and use. 


Design patterns [14, 15] have been proposed as a 
technique for application development that also em-
phasizes reuse of software architectures, including
those for distributed systems. Design patterns al- 
low software developers to write their applications 
using high-level models that are independent of lan- 
guage and machine architecture. The patterns focus 
on key components and their interaction to facili- 
tate reuse of software. Design patterns have been 
used for writing large scale commercial applications. 
The Metis workspace has been written as a design 
pattern for desktops on thin clients. The workspace 
provides components for locating and binding to ser- 
vices, access to system services, and security compo- 
nents. The intent is to allow application developers 
to use the workspace pattern in developing applica- 
tion suites for thin clients. 


6 Conclusions 


In this paper we presented a thin-client program- 
ming model where clients download application front 
ends that have a presentation layer and some ap- 
plication logic, but the bulk of an application is 
executed as services on remote servers. We de- 
scribed the design and implementation of a frame- 
work, called Metis, that enables the thin-client pro- 
gramming model, and showed how it can be used in 
sample applications. 


The design of the Metis framework has an open ar- 
chitecture composed of abstract interfaces for var-
ious services so that any implementation can be
plugged in. We implemented both client- and server- 
side infrastructure and realized a full end-to-end 
framework that provides support for: 


finding and binding services
access control and authentication
system services
code services


Metis support for late binding makes it possible to 
write reliable, flexible, and manageable thin-client 
applications. Moreover, the Metis framework pro- 
vides true platform independence beyond the lan- 
guage level by virtualizing all system resources as 
services. A demonstration of the reference Metis 
implementation that includes the Metis classes and 
API documentation is available at the IBM alphaWorks
web site (http://www.alphaworks.ibm.com)
as TCAF. 


During the course of this work we identified sev- 
eral areas of further research which may be bene- 
ficial to the thin-client programming model. These 
include support for application-specific recovery, re-
mote event mechanisms, and improved security and
communications. We plan to explore the above ar- 
eas as we enhance Metis to fully realize the benefits 
of the thin-client programming model. 


Abstract


Vendors cannot provide all the operating system 
services that users demand. As a result, there has 
been a persistent desire to make operating systems 
more flexible and customizable. It is natural that 
object-oriented technology would come to bear on 
this area. However, many solutions have been disap- 
pointing when it comes to ease of use. 

This paper describes the design and implementa- 
tion of Frigate, an object-oriented file system. The 
goal of Frigate is to provide a modular, extensible 
framework. The framework allows new extensions
to be “plugged-in” on the fly. Frigate’s focus differs 
from most other file system designs in that it is tar- 
geted for use by ordinary users rather than by soph- 
isticated operating system gurus. Thus, ease of use 
is a very important concern in the design. Frigate is 
fully implemented and supports a set of example file 
system extensions. 


1 Introduction 


Vendors cannot provide all the operating system 
services that users demand. On the one hand, it 
is economically infeasible, and on the other hand, 
many times they do not know what the users de-
sire. Vendors rightly concentrate on providing a few
general purpose services. Users can either lobby the 
vendor to include some desired service or implement 
it themselves. Adding services to the operating sys- 
tem is a difficult proposition. Traditionally, special 
privilege and access to source (and possibly a license 
agreement) is required. Programming the kernel 
can demand special care and knowledge. Debugging
modified operating systems is difficult as well, usu-
ally requiring specialized kernel address space tools
and continual rebooting. Convincing someone
else that your extension of the operating system is
not too risky is not always easy. Distribution of 
modifications may encounter similar trust problems 
and be limited by license restrictions. Compatib- 
ility between independently written extensions and 
the ability to further modify them are also potential 
problems. 


The long standing desire to make operating sys- 
tems more flexible and customizable led to various 
architectural innovations, including loadable device 
drivers, streams [42], vnodes [28] and micro-kernels. 
All of these ideas were aimed, at least in part, at 
providing extensibility. It is natural that object- 
oriented technology, with its themes of modular ex- 
tensibility, would eventually come to be applied to 
this problem. 


Most attempts at applying object-oriented techno- 
logy to operating systems are attempts to internally 
restructure the operating system into a more mod- 
ular organization. The object-oriented model is not 
generally exported to the users. The intended audi-
ence of such solutions is expert operating system
architects, who are conversant in the intricate intern- 
als of the operating system. Such tools are extremely 
hard for ordinary users to use. 


Our particular problem domain is the file system. 
Within this domain, Frigate takes a different ap- 
proach. The intended audience of Frigate is ordin-
ary users. Frigate’s object model is not just for in-
ternal use. It is fully exposed and usable by ordinary
users. A programmer using Frigate needs to be fa- 
miliar with object-oriented concepts and system-call 
programming but does not need to be an operating 
system guru. We believe we have constructed tools 
that can be easily and widely used. This enables a 
much larger audience to do powerful, modular exten- 
sions of the file system. 


Our overall goal is to provide a way to add value to 
the file system easily. Towards this end, Frigate at- 
tempts to provide a modular, easy-to-use, persistent 
object framework that also allows incremental usage 
and is fully compatible and integrated with the cur- 
rent filing environment. 


2 Architecture


The Frigate system consists of five main com- 
ponents built on top of UNIX. At the lowest level, 
Frigate uses typed files. File System extensions are 
stored in an external Repository. When extensions 
are used, they are instantiated as server processes. 
Within the operating system, Frigate provides a Dis-
patcher module, which manages the servers and in-
tercepts file system calls, passing them out to the 
servers. This module is built on the Stackable Lay- 
ers framework. Finally, an object-oriented program- 
ming interface is provided by using Xerox PARC’s 
Inter-Language Unification (ILU) system [25]. The 
runtime architecture of Frigate is illustrated in Fig-
ure 1.


2.1 Stackable Layers Infrastructure 


Frigate marries two different worlds together; 
it connects a UNIX file system model with a 
CORBA [36] based object model. On the UNIX 
side, Frigate uses the Stackable Layers framework, 
which can be seen as a generalization of the Vir- 
tual File System (VFS) [28]. VFS is the basis for 
most modern UNIX file system implementations. In 
VFS, the file system portion of the operating sys- 
tem is divided into a generic portion and specific 
file system implementations (e.g., Unix File System 
(UFS), NFS, PC-FS). All files (directories, pipes, 
etc.) are represented in VFS by an abstract type 
called a vnode (virtual node). All calls into the spe- 
cific implementations are operations on this abstract 
type. This allows other specific implementations to 
be added without any modification of the generic 
code. In practice, though, this is difficult as each 
specific implementation is still quite large and must 
be developed and debugged in the operating system 
address space. 

A further refinement of VFS is the UCLA Stack-
able Layers file system [23]. In this model, each spe-
cific implementation is made up of a stack of protocol
layers in the kernel similar to System V streams [42]. 
Each layer is a much smaller entity than the large 
monolithic file system implementations of pure VFS. 
Generally, layers implement a specific increment of 
file system features. Operations not implemented 
by a particular layer are passed down to lower lay- 
ers in the stack, via the “bypass” operation (a form 
of delegation). The layers use the vnode interface 
for inter-layer calls and can be independently de- 
veloped, replaced and composed together. For ex- 
ample, a file system might be composed of a layer 
for naming/directory services and a layer for storage. 
Later, the storage layer could be replaced with one
implementing a log-structured storage [43] without
requiring any changes to the naming layer. Stackable
Layers was used to implement the Ficus replicated
file system [18, 19]. Layers need not be configured
strictly linearly as stacks but can also be placed in
tree configurations. As we will see, Frigate exten-
sions are implemented in this model as layers.


Figure 1: Frigate Architecture
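
The bypass idea can be illustrated with a small Java analogy. Real Stackable Layers live in the kernel and operate on vnodes; the classes, operation names, and string results below are purely hypothetical and only model the delegation pattern.

```java
// Hypothetical model of stacked layers with a "bypass" default.
class LayerSketch {
    static class Layer {
        private final Layer below;

        Layer(Layer below) { this.below = below; }

        // Default behavior is bypass: hand the operation to the next lower layer.
        String handle(String operation, String path) {
            if (below == null) {
                throw new UnsupportedOperationException(operation);
            }
            return below.handle(operation, path);
        }
    }

    // Bottom layer: implements every operation it is given.
    static class StorageLayer extends Layer {
        StorageLayer() { super(null); }

        @Override
        String handle(String operation, String path) {
            return "storage:" + operation + ":" + path;
        }
    }

    // Upper layer: implements only "lookup" and bypasses everything else.
    static class NamingLayer extends Layer {
        NamingLayer(Layer below) { super(below); }

        @Override
        String handle(String operation, String path) {
            if (operation.equals("lookup")) {
                return "naming:lookup:" + path;
            }
            return super.handle(operation, path);  // bypass
        }
    }

    public static void main(String[] args) {
        Layer stack = new NamingLayer(new StorageLayer());
        System.out.println(stack.handle("lookup", "/doc"));
        System.out.println(stack.handle("read", "/doc"));
    }
}
```

Replacing StorageLayer with another bottom layer requires no change to NamingLayer, which mirrors the log-structured storage example in the text.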


2.2 Documents 


Frigate uses typed file entities, which we call doc- 
uments. A file entity is anything that can be named 
in the file system: file, directory, pipe, etc. Along 
with the standard UNIX file attributes, each doc- 
ument has associated class and version identifiers. 
The class is a name for a particular method interface 
and hence a set of services provided by the docu-
ment. The version is used with the class to pick a
particular implementation at bind time. To prevent 
name clashes, but at the same time, permit inde- 
pendent development, the actual identifiers are NCA 
UUIDs [52], which are globally unique but can be 
generated in a distributed fashion. This obviates 
the need for any central registry. In our current 
implementation, the storage for these attributes is 
provided for in a separate Stackable Layer [50]. 


Documents are objects in our system. However, 
they have some distinctive features. In our model 
there is a standard implicit UNIX file class, which 
all document classes inherit from directly or indir- 
ectly. Vnode operations appear in the Frigate model 
as methods. Document classes can declare that their 
associated versions have modified vnode method im- 
plementations. This redefinition of vnode methods 
is possible for all file operations, except for a few 
reserved for Frigate’s infrastructure. Of course, doc-
ument classes can also add any other methods re-
quired for their application usage. Because docu-
ments have an associated UNIX file entity, they can
be named by using file pathnames and can also use 
file data storage. 


2.3 Repository 


The repository stores the interfaces and imple- 
mentations for Frigate extensions. Each class has 
an interface description indexed by class identifier. 
The interface description consists of identifiers for 
vnode methods that are redefined by the document 
class. This information is used by the Dispatcher to 
determine if a vnode operation should be intercepted 
and passed to an extension server. The Dispatcher 
also sends vnode operations carrying ILU method in- 
vocations to the servers. 

Implementations stored in the repository are the 
actual servers that implement objects. The imple-
mentations are indexed by class and version. One
implementation for each class may be marked as pre- 
ferred. Applications can specifically request the pre- 
ferred implementation. A preferred implementation 
is usually the latest or most stable version. Stored 
with the implementation is load balancing informa- 
tion and configuration parameters for the server. 

The repository also supports alias records. An
alias is an alternate name for an implementation or 
another alias record. It defines a mapping from one 
implementation identifier to another identifier. As 
described later, this feature is used to aid in version 
management. 


The repository is implemented by a repository 


server and database. Requests from both the Dis- 
patcher and front-end user programs are handled by 
the server. The repository server handles various
requests from the Dispatcher involving server man- 
agement, interface query and binding of implement- 
ations. The process of binding involves mapping a 
class/version pair to a specific server implementa- 
tion, taking into account preferred implementations 
and alias records. 
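
The binding step can be sketched as a lookup that first substitutes the preferred version when none is requested and then follows alias records until an implementation is reached. The data layout, the "class/version" string identifiers (standing in for UUIDs), and the rule that aliases are followed before implementations are consulted are assumptions of this sketch.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of repository binding.
class RepositorySketch {
    private final Map<String, String> implementations = new HashMap<>(); // id -> server program
    private final Map<String, String> aliases = new HashMap<>();         // id -> id
    private final Map<String, String> preferred = new HashMap<>();       // class -> version

    static String id(String cls, String version) { return cls + "/" + version; }

    void addImplementation(String cls, String version, String server) {
        implementations.put(id(cls, version), server);
    }

    void addAlias(String fromId, String toId) { aliases.put(fromId, toId); }

    void setPreferred(String cls, String version) { preferred.put(cls, version); }

    // Map a class/version pair to a server; a null version asks for the preferred one.
    String bind(String cls, String version) {
        if (version == null) {
            version = preferred.get(cls);
            if (version == null) {
                throw new IllegalStateException("no preferred implementation for " + cls);
            }
        }
        String key = id(cls, version);
        Set<String> seen = new HashSet<>();
        // An alias may name an implementation or another alias, so follow the chain.
        while (aliases.containsKey(key)) {
            if (!seen.add(key)) throw new IllegalStateException("alias cycle at " + key);
            key = aliases.get(key);
        }
        String server = implementations.get(key);
        if (server == null) throw new IllegalStateException("no implementation for " + key);
        return server;
    }

    public static void main(String[] args) {
        RepositorySketch repo = new RepositorySketch();
        repo.addImplementation("doc", "2.0", "doc-server-v2");
        repo.addAlias(id("doc", "1.0"), id("doc", "2.0"));
        repo.setPreferred("doc", "2.0");
        System.out.println(repo.bind("doc", "1.0"));
        System.out.println(repo.bind("doc", null));
    }
}
```

In this model, an alias from an old version to a new one redirects existing documents without touching their stored attributes.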


Front-end user programs interact with the repos- 
itory server mainly to add and remove class descrip- 
tions, server implementations and alias records. Se- 
curity features only allow the class owner to manipu- 
late the class description and associated implementa- 
tions. Otherwise, all users are capable of adding and 
managing classes and implementations. 


2.4 Dispatcher 


The Dispatcher is a stackable layer that stacks in 
the operating system above the layers that provide 
file storage and Frigate attributes. This layer inter- 
cepts vnode operations for documents and directs the 
management of the servers that implement document 
objects. From the view of Stackable Layers, this is 
just another layer. The Dispatcher layer, however, 
does not have the same behavior for all files. Instead, 
the behavior of the Dispatcher varies according to the 
class and version of the document. 


When the Dispatcher vnode for a newly referenced 
file is constructed, some special actions occur. First, 
it is determined whether the file is a document or 
not. A document has Frigate attributes; other file 
entities do not. If the file entity is not a document, 
the vnode is set up to simply bypass all operations 
to the layer beneath it. Thus, ordinary files behave 
as if the Dispatcher layer did not exist at all. 


If a vnode belongs to a document, then the class 
attribute is used to find its interface by querying the 
repository server. (Information is cached, so sub- 
sequent queries may avoid RPC with the repository 
server.) The interface information from the repos- 
itory is used to determine which vnode operations 
should be intercepted and sent to the server. All 
other operations bypass to the layer beneath. 
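
The intercept-or-bypass decision, including the cached interface lookup, might be modeled as below. The repository query is represented by a plain function and vnode operations by strings; both are simplifications chosen for this sketch, and a null class identifier stands for a file entity without Frigate attributes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical model of how a dispatcher decides between intercept and bypass.
class DispatcherSketch {
    private final Function<String, Set<String>> repositoryQuery;
    private final Map<String, Set<String>> interfaceCache = new HashMap<>();
    private int repositoryCalls;

    DispatcherSketch(Function<String, Set<String>> repositoryQuery) {
        this.repositoryQuery = repositoryQuery;
    }

    int repositoryCalls() { return repositoryCalls; }

    // Returns true if the operation should be sent to an extension server,
    // false if it should bypass to the layer beneath.
    boolean intercepts(String classId, String vnodeOperation) {
        if (classId == null) {
            return false;  // ordinary file: bypass everything
        }
        Set<String> redefined = interfaceCache.get(classId);
        if (redefined == null) {
            redefined = repositoryQuery.apply(classId);  // stands in for the RPC
            repositoryCalls++;
            interfaceCache.put(classId, redefined);
        }
        return redefined.contains(vnodeOperation);
    }

    public static void main(String[] args) {
        DispatcherSketch d = new DispatcherSketch(cls -> Set.of("read", "write"));
        System.out.println(d.intercepts("compressed-doc", "read"));
        System.out.println(d.intercepts("compressed-doc", "rename"));
        System.out.println(d.intercepts(null, "read"));
        System.out.println(d.repositoryCalls());
    }
}
```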

Servers are lazily started; no attempt is made to 
start one until some method or vnode method is 
called. In this way, no unnecessary server starts 
are performed. When an operation that requires a 
server is invoked, the document is actually bound 
to a particular implementation. This late binding 
allows the most recent repository changes to be re- 
flected in the binding, even if these changes occurred 
since the time of class interface lookup. When the 
actual implementation is determined, the list of act-
ive servers is checked for a matching server. If a
matching implementation server has spare capacity, 
then the document is assigned to that server. If no 
server is available, then one is started. 


After a document has been assigned to a server, 
methods and vnode methods are passed to the server 
using Stackable Layer transports. Methods pass 
through special vnode operations that carry ILU in-
formation to and from the server. A server may
handle requests from several different users. For 
the duration of each request, the server’s credentials 
are changed to that of the process that invoked the 
method. Thus, the server will have exactly the priv- 
ileges of the client. Between requests, the server is 
given special reduced privileges. If we did not have 
such a mechanism, it would not be possible to have 
a single, possibly untrusted, server handle multiple 
users. Each user would have to have his own server 
to assure correct access credentials. 
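
The per-request credential switch can be modeled with a context manager. This is a sketch only: it tracks an effective user id in a Python object rather than changing real process credentials, and `REDUCED_UID`, `DocumentServer` and `acting_as` are hypothetical names.

```python
from contextlib import contextmanager
from typing import Iterator, List

REDUCED_UID = 65534  # hypothetical uid for the reduced-privilege state


class DocumentServer:
    def __init__(self) -> None:
        self.effective_uid = REDUCED_UID
        self.log: List[int] = []

    @contextmanager
    def acting_as(self, caller_uid: int) -> Iterator[None]:
        # Take on the caller's identity for the duration of one request.
        self.effective_uid = caller_uid
        try:
            yield
        finally:
            # Always fall back to reduced privileges, even if a handler fails.
            self.effective_uid = REDUCED_UID

    def handle(self, caller_uid: int, owner_uid: int) -> bool:
        with self.acting_as(caller_uid):
            self.log.append(self.effective_uid)
            # Access is judged against the credentials of this request only.
            return self.effective_uid == owner_uid


server = DocumentServer()
print(server.handle(caller_uid=1001, owner_uid=1001))  # owner: allowed
print(server.handle(caller_uid=1002, owner_uid=1001))  # other user: denied
print(server.effective_uid == REDUCED_UID)             # reduced between requests
```

Because the identity is reset after every request, rights granted for one caller cannot carry over to the next.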


When a document is no longer referenced, the 
vnode for the document is destroyed. During this 
process, a message is sent to inform the server. 
When a server no longer has any documents assigned 
to it, the Dispatcher directs the repository server to 
terminate the server. 


2.5 Servers 


Stackable Layers provides some facilities to run 
user-level file system servers. Frigate servers are 
modified and enhanced versions of the Stackable 
Layer servers. Frigate servers are structured some- 
what differently to allow modular construction from 
ILU files and to accommodate method dispatch at 
runtime. Since Frigate servers are user-level, server 
programmers do not have to worry about kernel pro- 
gramming restrictions. Frigate provides the means 
for running more than one implementation side by 
side for any particular class. Also, multiple servers 
can run for load balancing. 


Documents are persistent objects, which can be 
in one of two states: active and passive. An active 
document has been assigned to a server and is ready 
to service requests (i.e., has a running implementa- 
tion). A passive document has no server. Whenever 
a document is newly assigned to a server, methods 
are automatically invoked on the document to activ- 
ate it. Typically, these methods initialize the docu- 
ment’s implementation from its persistent file data. 
When the last reference to a document is removed, 
methods are automatically invoked to deactivate the 
object. Usually, these methods force any pending 
updates out to persistent storage.

The actual implementation of a method lies in its
handler code. If necessary, the handler may call on 
services from layers lower down in the stack. Calling 
on lower layers leads back into the operating system 
rejoining with the in-kernel stack. This is accom- 
plished by using a special Stackable Layers trans- 
port. 
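
The activation and deactivation cycle described earlier in this section can be illustrated with a reference-counted toy object. The sketch assumes a dictionary in place of persistent file data; `Document`, `acquire` and `release` are hypothetical names, not Frigate interfaces.

```python
class Document:
    """Toy persistent object; a dict stands in for persistent file data."""

    def __init__(self, storage: dict, key: str) -> None:
        self.storage = storage
        self.key = key
        self.state = "passive"
        self.references = 0
        self.contents = ""

    def acquire(self) -> None:
        if self.references == 0:
            self.activate()
        self.references += 1

    def release(self) -> None:
        self.references -= 1
        if self.references == 0:
            self.deactivate()

    def activate(self) -> None:
        # Initialize the in-memory implementation from persistent data.
        self.contents = self.storage.get(self.key, "")
        self.state = "active"

    def deactivate(self) -> None:
        # Force pending updates out to persistent storage.
        self.storage[self.key] = self.contents
        self.state = "passive"

    def append(self, text: str) -> None:
        if self.state != "active":
            raise RuntimeError("document is not active")
        self.contents += text


disk = {"/docs/notes": "hello"}
doc = Document(disk, "/docs/notes")
doc.acquire()
doc.acquire()
doc.append(" world")
doc.release()
print(doc.state, repr(disk["/docs/notes"]))  # still active, not yet flushed
doc.release()
print(doc.state, repr(disk["/docs/notes"]))  # passive, update written out
```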


2.6 Programming Interface 


The Inter-Language Unification (ILU) system [25] 
from Xerox PARC is used by Frigate to provide 
an object-oriented programming framework. ILU 
provides a CORBA [36] based object model. In 
ILU, type definitions and interface definitions for ob- 
jects (including multiple inheritance relationships) 
are written in CORBA IDL (or ILU’s own ISL). 
These definitions are compiled into target language 
mappings, providing client calling stubs and server 
method skeletons. The target language mappings 
are compiled along with application code to con- 
struct the actual ILU client and server programs. 
At runtime, ILU clients and servers communicate by 
using transports selected from ILU’s library. 


Frigate integrates document objects and vnode 
methods with ILU. Currently, Frigate implements 
support for documents in both IDL and ISL and 
provides support for the ANSI C target language. 
Frigate applications are programmed just like other
ILU applications, except that servers must be assigned
a version number and registered with the repository
to be used. Compiling interface definitions
automatically produces the required calling stubs and
skeletons for any vnode methods. Server method 
handlers may be programmed using the Stackable 
Layers vnode model or the regular UNIX system call 
model. 


Frigate document classes appear in the ILU world 
as ILU object types. They are distinguished from 
other ILU objects by inheriting object types from the 
implicitly defined UNIX file interface. This avoids 
any syntactic change to ISL or IDL. Only documents 
may have vnode methods. The semantics of inherited 
vnode methods, though, are different from those of 
other inherited methods. Even if a particular vnode 
method is not inherited, a document will still try to 
delegate that method invocation by using the Stack- 
able Layers bypass mechanism. Vnode methods are, 
by design, inherited only. The signatures of vnode 
methods are externally fixed and so cannot be re- 
defined inconsistently with the rest of the framework. 

At runtime, Frigate documents are located by us- 
ing file system paths. The file system namespace 
acts as the published object registry. Through the
use of Frigate utility routines, the file system name 
of a document is resolved internally into an object 
handle and then into a local proxy object. Thereafter,
messages for ILU methods or vnode methods
can be sent to the document object by using the call- 
ing stubs defined in the language mapping. Method 
invocations made on proxy objects are automatically 
converted into calls to the Frigate document serv- 
ers. The calls are transported to Frigate servers via 
the Frigate transport, which packages the Stackable 
Layers infrastructure as an ILU transport. 


It should be noted that vnode methods can be in- 
voked both through ILU and through ordinary UNIX 
system calls. UNIX system calls are serviced by the 
VFS generic file system code, which in turn makes 
vnode operation calls. These calls are selectively 
intercepted by the Dispatcher just as direct vnode 
method calls would be. This means that Frigate doc- 
ument servers can even enhance the behavior of pro- 
grams that have no knowledge of the Frigate object 
model by enhancing standard UNIX file operations. 
Thus, almost all existing UNIX programs can trans- 
parently use Frigate. 


2.7 Versions 


The mechanisms provided by Frigate allow for 
multiple models for version management. In all 
cases, the repository accommodates installation of 
classes and implementations without any disruption 
of system operation. Currently active documents will 
continue to execute with their bound implementa- 
tions; any new use will immediately reflect any re- 
pository updates. Multiple implementations for the 
same class can run side by side. New implementa- 
tions can be installed and tested in isolation before 
general use by proper version management. 


Documents marked with preferred versions always 
use the implementation so marked. This feature is 
used to support an “eager” model of version manage- 
ment. In this model, the newest accepted version is 
designated as preferred. In this way, documents can 
automatically use the most recent version. New ver- 
sions, though, should not be designated as preferred 
until tested sufficiently. 


A more selective model uses aliases to effect the 
update. After acceptance of a new version, the old 
version is replaced by an alias that maps to the new 
version. Any documents with the old version identifier
automatically use the new version. There is
no need to hunt down existing documents and up- 
date their version identifiers. As documents are used, 
they may have their version identifiers rolled forward. 
However, it is not necessary for all old versions to
be replaced by aliases. Situations where there are incompatible
changes in data formats or limited trust
placed in a new version can be handled. This is a 
“lazy” model where no change is made until an old 
version is explicitly mapped to the new version. It is 
also possible to have chains of aliases that converge 
to allow future coalescing of versions. 
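
The two update models can be sketched as a lookup over an alias table. This is a minimal illustration under stated assumptions: versions are plain strings, the marker `PREFERRED` is a hypothetical way of tagging a document as following the preferred version, and `resolve_version` and `bind` are not Frigate repository commands.

```python
from typing import Dict

PREFERRED = "preferred"  # hypothetical marker stored in a document's version attribute


def resolve_version(version: str, aliases: Dict[str, str]) -> str:
    """Follow alias links until reaching a version that is not aliased."""
    seen = set()
    while version in aliases:
        if version in seen:
            raise ValueError(f"alias cycle at {version!r}")
        seen.add(version)
        version = aliases[version]
    return version


def bind(document_version: str, aliases: Dict[str, str], preferred: str) -> str:
    # "Eager" model: documents marked preferred follow the designated version.
    if document_version == PREFERRED:
        return preferred
    # "Lazy" model: only explicitly aliased versions roll forward.
    return resolve_version(document_version, aliases)


aliases = {"v1": "v2", "v2": "v4", "v3": "v4"}  # two chains converge on v4
print(bind("v1", aliases, preferred="v6"))       # v4
print(bind("v5", aliases, preferred="v6"))       # v5 (never aliased)
print(bind(PREFERRED, aliases, preferred="v6"))  # v6
```

Documents that name `v1` or `v3` need no change themselves; adding one alias entry is enough to move them to `v4`.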

In many object-oriented systems, subclasses are 
often defined to provide a specialized implementation 
even when there is no interface specialization. For 
example, an image object class may have subclasses 
for GIF and TIFF formats. Since implementation in 
Frigate is completely divorced from class definition, 
the version system can allow specialized implementations
without subclassing. No duplication of interface
is needed. Separate specialized implementations can 
be installed under the same class. Thus, two dif- 
ferent versions do not necessarily represent different 
stages along a single line of development. They may, 
in fact, be parallel separate lines of implementation. 
In Frigate, there may be GIF and TIFF implement- 
ations for the same image class. 

The version system can, in effect, provide a form 
of polymorphism. The advantage of doing this is
that simple repository commands can give the user 
complete control over the binding process. By us- 
ing aliases, the evolution along any particular line of 
development can still be transparently managed. If 
future implementations merge lines of development, 
alias chains can be made to converge as well.


3 Examples 


Frigate has a variety of possible uses. Here we 
outline some possible applications of Frigate
to illustrate the flexibility of the framework. We
have actually implemented a transparent encrypted 
file system, image files, intentional files and a simple 
access control system. 

Frigate can be used to provide file system features 
such as transparent compression, encryption and mi- 
gration. For the most part, these are services that 
continue to use the ordinary system call interface. In 
Frigate, such services are provided transparently and 
automatically by redefining vnode methods. The new 
definitions for vnode operations enhance the behavior
of standard system calls. In this way, no change to 
old programs is necessary to take advantage of en- 
hanced behavior. 

Another example of this type is the intentional file.
Instead of explicitly storing the contents of a file, the 
contents are instantiated on the fly. Traditionally, 
UNIX stores information on users, hosts and networks
in a series of files. Sun’s NIS service modifies
the standard library calls to access network servers
instead of the files. Frigate can provide the same sort 
of service through intentional files. The file contents 
are generated on demand based on information from 
the network servers. The advantage here is that even
programs that have the old library code can benefit. 
Dynamic information, such as network loads, logs or 
realtime data can similarly be provided by an in- 
tentional file interface. Intentional files can also be 
used as a form of compression for large but easily 
regenerated datasets. 
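
An intentional file can be pictured as a file-like object whose bytes come from a generator function rather than from stored data. The sketch below is an assumption-laden illustration: a dictionary stands in for the network service, and `IntentionalFile` and `render_hosts` are hypothetical names.

```python
from typing import Callable, Dict


class IntentionalFile:
    """File-like object whose contents are produced on demand."""

    def __init__(self, generate: Callable[[], bytes]) -> None:
        self._generate = generate
        self._snapshot = b""
        self._offset = 0

    def open(self) -> None:
        # Instantiate the contents at open time instead of storing them.
        self._snapshot = self._generate()
        self._offset = 0

    def read(self, size: int) -> bytes:
        chunk = self._snapshot[self._offset:self._offset + size]
        self._offset += len(chunk)
        return chunk


# In-memory table standing in for a network information service.
hosts: Dict[str, str] = {"alpha": "10.0.0.1"}


def render_hosts() -> bytes:
    lines = [f"{addr}\t{name}" for name, addr in sorted(hosts.items())]
    return ("\n".join(lines) + "\n").encode()


f = IntentionalFile(render_hosts)
f.open()
print(f.read(100))
hosts["beta"] = "10.0.0.2"  # the service changes
f.open()                    # a later open sees fresh contents
print(f.read(100))
```

A program that only knows how to open and read a hosts file would see current data without any change to its code.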

Other applications might use a mode of opera- 
tion that is not strictly transparent. In such cases, 
the interface is extended to provide additional func- 
tions. An example is an enhanced file access con- 
trol system. In such a system, there are additional 
methods to manage access control lists, provide dif- 
ferent levels of information to users, read logs, etc. 
However, the majority of the file system access (e.g., 
read/write) still goes through the ordinary system 
call interface. 

Frigate also provides an object-oriented model of 
access. Programs using this model can take advant- 
age of polymorphism. A single document class for 
image files might have several implementations, each 
of which services a different image format. A single 
program could be written to manipulate any of the 
image files via the document interface. The program 
would not have to be changed, if available formats 
changed or if a format was added after the program 
was written. This is because the implementations 
(servers) are completely separated from the inter- 
face (class definitions) and from the client program 
binary. Despite the generic nature of the program, 
format specific code is executed for each particular 
file for efficient access. 
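
The separation of a generic client from format-specific implementations can be sketched as follows. The registry keys and the "toy" format are made up for illustration; only the GIF header layout (16-bit little-endian width and height after the 6-byte signature) is taken from the GIF format itself.

```python
from typing import Callable, Dict, Tuple

Size = Tuple[int, int]


def gif_size(data: bytes) -> Size:
    # GIF stores width and height as little-endian 16-bit values at offset 6.
    return (int.from_bytes(data[6:8], "little"),
            int.from_bytes(data[8:10], "little"))


def toy_size(data: bytes) -> Size:
    # Made-up format for illustration: an ASCII "WxH" header line.
    width, height = data.split(b"\n", 1)[0].split(b"x")
    return (int(width), int(height))


# Implementations installed under one image class, keyed by version.
IMPLEMENTATIONS: Dict[str, Callable[[bytes], Size]] = {
    "image/gif": gif_size,
    "image/toy": toy_size,
}


def image_size(version: str, data: bytes) -> Size:
    """Generic client call: the version attribute selects the implementation."""
    return IMPLEMENTATIONS[version](data)


gif = b"GIF89a" + (320).to_bytes(2, "little") + (200).to_bytes(2, "little")
print(image_size("image/gif", gif))             # (320, 200)
print(image_size("image/toy", b"64x48\n..."))   # (64, 48)

# A format added later needs no change to image_size or its callers.
IMPLEMENTATIONS["image/raw"] = lambda data: (data[0], data[1])
print(image_size("image/raw", bytes([16, 9])))  # (16, 9)
```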

The UNIX file interface is limited to addressing file 
entities. Frigate’s object-oriented model can address 
other sorts of objects. Messages may be sent directly 
to parts of a compound document represented by ob- 
jects. This allows compound documents similar to 
those in OLE [35] or OpenDoc [14] to be construc- 
ted in Frigate. In this case, a document object is a 
container for other “content” objects, which contain 
text, images, audio, spreadsheet cells, graphs, etc. 

Frigate’s facilities can be used to provide links.
Links are potentially interfile, persistent references 
to content objects that can be stored in documents. 
(ILU object handles are not persistent.) With a link 
tracking service, updates of a source can automat- 
ically be reflected in any document containing link 
references. Frigate can also be used to provide more 
general models of interfile consistency. Changes in 
one document can be automatically propagated, appended,
merged, etc. into other documents.

UNIX file operations are low level. Frigate can 
provide much higher level file system abstractions. 
Documents might have operations to print, archive, 
install and move (with all other associated objects). 
Operations can be tailored to be more meaningful to 
the application. Documents are capable of any ac- 
tion that can be accomplished by a program, because 
documents are implemented by server programs. For 
example, a document might automatically enforce integrity
constraints or warn other users of important
changes by email. 


4 Extensibility 


Our overall goal of making the file system easier 
to extend has several aspects. We provide for extensions
that are not resident in the operating system
address space. This frees us from the classic
problems of operating system development including: 
privileged access, source access, specialized operat- 
ing system knowledge, programming and debugging 
restrictions, rebooting, potentially disastrous effects 
of bugs and redistribution restrictions. 

Frigate is designed to scale. Frigate class and ver- 
sion identifiers are assigned in a distributed fash- 
ion. Extensions are not limited by having to all 
fit into a single address or overlay space. Practic- 
ally speaking, Frigate is limited by runtime resources 
(i.e., number of simultaneous running servers). 

Frigate provides an object-oriented interface to 
the file system. This provides a single coherent cli- 
ent interface paradigm, rather than a hodgepodge 
of ad hoc interfaces. Variants are cleanly described 
through interface inheritance. The object interface 
also provides the ability to use polymorphism with 
file operations. Frigate’s late binding mechanism al- 
lows implementation to be changed right up until a 
server is required. Since the actual implementation 
is not part of the client program, no recoding, recompilation
or relinking is ever necessary to make
full use of polymorphism. 

The class provider is given a structured envir- 
onment that merges vnode and ILU method pro- 
gramming into a common framework. Frigate re- 
mains compatible with Stackable Layers but with 
a better programming interface. The familiar file
descriptor programming model is also available to 
program methods. 

True encapsulation that cannot be bypassed, accidentally
or purposefully, is provided around file entities.
Redefined vnode operations can enhance the
behavior of old programs without any change to the 
program.

Frigate provides a mechanism for flexible version
management. When new versions are installed, there 
is no need to reboot or restart the system. Multiple 
implementations can run side by side with no inter- 
ference. New runtime instances are automatically 
assigned the updated versions. There is no need to 
hunt down every document instance to update it. 

Except for initial installation of the framework, all 
facilities can be used with full capabilities by un- 
privileged users. Anyone can write, install, debug 
and use extensions to the file system. 


5 Safety and Security 


Frigate extensions are placed in server processes 
outside of the operating system address space. The 
process boundary around each server acts as a fire- 
wall. Buggy or malicious behavior is confined to the 
server process. Extensions communicate with the op- 
erating system through Stackable Layers transport 
layer interfaces and via system calls. Extensions do 
not have direct access to the operating system ad- 
dress space. Thus, the operating system is protected 
from the server just as well as it is protected from 
any other user process. 

During the servicing of a method, the server holds 
the same access rights as the calling process. In 
between method calls, the access rights are changed
to those of a special unprivileged Frigate user. The
rights granted during method service are identical to
those given to a program executed by a user. The
difference is that access rights are being granted on 
a lower level of granularity and access rights change 
over the life of the server. It is not possible to use 
the varying credentials to accumulate access rights. 
Even if access is granted previously, the next access 
may be barred on the basis of the submitted cre- 
dentials. Overall, malicious extension code cannot 
do anything not possible through ordinary malicious 
program execution. 

However, Frigate does offer more danger as a Tro- 
jan horse. Precisely because Frigate can operate 
transparently, it is somewhat easier to unknowingly 
run an untrusted Frigate extension than to execute 
an untrusted program. This is especially true if a 
user defines a malicious vnode method. Frigate at- 
tempts to mitigate the problem in two ways. First, 
privileged (root) clients do not pass their rights to 
Frigate servers. Thus, the only actions that a Frig- 
ate server can perform for a root user are those that 
any non-privileged program could. Root users can 
always disable Frigate, if necessary. In addition, it 
is always possible for users to examine an accessible 
file entity for Frigate attributes. (Frigate does not
allow the relevant vnode operations to be redefined.)
In this way, a user can safely determine if a suspi- 
cious file entity is an untrusted Frigate document by 
examining its attributes. 

The Frigate repository is shielded from unauthor- 
ized modification by file protections. Requests typic- 
ally issued by the operating system are only accepted 
from protected ports. Only class owners can modify 
a document class and its implementations. 

Frigate offers complete encapsulation of docu- 
ments. Since Frigate is a part of the operating system 
and below the system call interface, it cannot be bypassed
accidentally or maliciously. Frigate allows any
specified operation to be intercepted, except those 
used by Frigate itself. This ensures that security or 
data integrity mechanisms will not be bypassed. Ma- 
licious users cannot defeat encapsulation by altering 
the attributes on documents. Only the owner or the 
root user can change the attributes on a document. 


6 Compatibility 


Frigate remains compatible with the vnode pro- 
gramming model. As previously described, some 
Frigate extensions provide their new functionality ex- 
clusively through vnode methods invoked indirectly 
through the UNIX system call interface. As a result, 
this type of extension provides complete backward 
compatibility for existing programs. Old programs 
work “as is” and require no changes at all, not even 
recompilation or relinking, to take advantage of the 
new functionality. This preserves the current soft- 
ware investment. In many cases, this means that 
application programmers do not need to learn new 
paradigms to take advantage of Frigate capabilities. 

Since Frigate is packaged as a Stackable Layer, 
Frigate also offers an inherent form of forward com- 
patibility. The other layers of the stack may be up- 
dated or reconfigured independently of Frigate. This 
means that a Frigate system can take advantage of 
new features in other layers without any change to 
Frigate itself. For example, an enhanced storage substrate
might be provided by substituting a new storage
layer for the standard UFS layer.

Frigate also offers incremental use. While Frigate
offers a new paradigm, one is not forced into an “all 
or nothing” decision to use it. Frigate can coexist 
with standard UNIX. A user need only use Frigate 
as much as is desired. There is no need to conform 
all activity to the new object-oriented environment. 
Porting of existing facilities is not necessary because 
the current environment continues to exist on a Frig- 
ate system. Frigate documents are distinguished by 
class and version attributes. Frigate documents may
be stored in any file system that can support those
attributes. Otherwise, there is no restriction on stor- 
ing ordinary files and Frigate documents in the same 
volume. 

In some cases, it may be desirable to port old applications
to the object-oriented Frigate paradigm.
For example, the benefits of being able to write gen- 
eric clients, which take advantage of polymorphism, 
may justify the porting cost. In other cases, ap- 
plications may wish to use Frigate’s version man- 
agement system. The incremental nature of Frig- 
ate aids application migration. Migration may be 
accomplished by “wrapping”. Old files are conver- 
ted to documents by wrapping class and version at- 
tributes around them. The addition of the Frigate 
attributes brings them under Frigate management. 
Old programs are “wrapped” as methods by having 
method handlers simply execute the programs dir- 
ectly. In this way, an old application is quickly en- 
abled to run under Frigate. Further changes may, of 
course, require more extensive restructuring. 
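
Wrapping an old program as a method can be sketched as a handler that simply runs the program on the document. The example below is a hedged illustration, not Frigate's handler interface: `make_wrapped_method` is a hypothetical helper, and a one-line Python script stands in for the legacy program.

```python
import subprocess
import sys
from typing import Callable, List


def make_wrapped_method(argv: List[str]) -> Callable[[str], str]:
    """Return a method handler that simply executes an existing program."""
    def handler(document_path: str) -> str:
        result = subprocess.run(argv + [document_path],
                                capture_output=True, text=True, check=True)
        return result.stdout
    return handler


# Stand-in for a legacy program: prints the length of the path it is given.
legacy = [sys.executable, "-c", "import sys; print(len(sys.argv[1]))"]
count = make_wrapped_method(legacy)
print(count("/docs/report").strip())
```

The handler adds no logic of its own, which is why wrapping is quick; restructuring the program into finer-grained methods would be a separate step.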

Frigate is also compatible with standard languages:
OMG IDL and ANSI C. All inheritance support
is confined to the ILU interface generator. Thus,
no extensions to C are necessary and the investment 
in C compilers and programming environment can 
be preserved. 


7 Performance 


We measured Frigate’s performance in two re- 
spects. First, we measured the cost of merely hav- 
ing, but not using, Frigate’s framework. We also 
compared the performance of Frigate to alternative 
solutions. 

All of our performance measurements were carried 
out under a SunOS 4.1.1 kernel on SPARC IPC (25 
MHz, 15.8 MIPS) machines with 12 MB of memory
and a SCSI 207 MB (3.5”, 3600 RPM) disk. The op- 
erating system was modified to use Stackable Layers. 
All test results are derived from 30 runs. 


7.1 Framework Overhead 


We would like Frigate to have minimal perform- 
ance impact when we are not using its facilities. To 
measure this impact, we ran the Modified Andrew 
Benchmark [24, 37] in a file volume with and without 
the Frigate service. 

No direct use was made of Frigate in the bench- 
mark. The configuration without Frigate was simply 
the standard UFS, packaged as a Stackable Layer. 
The configuration with Frigate included a Stackable 
Layer to implement extended (class and version) attributes
and the actual Frigate Dispatcher layer. The
results for each configuration and each phase of the
benchmark are shown in Table 1. Frigate times are
also shown as a percentage of UFS runtime.

Table 1: Non-Use Times. Layer configurations: UFS alone, and Dispatcher over Attribute over UFS. Rows: user, sys and elapsed time (mean and stddev). All runtimes in seconds; Frigate time is also given as a percent of UFS time.

Overall, the overhead of having Frigate, but not 
using it, amounts to an 8% elapsed time penalty. 
Additional measurements were made with just the 
extended attribute layer to see how much of the over- 
head came from that support layer. About seven of 
the eight percent of overall overhead comes from the 
extended attribute support layer. For general use, 
Frigate’s overhead would have to be reduced. The 
obvious strategy would be to improve the perform- 
ance of the attribute service. 


7.2 Comparison to Library 


Our second performance study compared Frigate 
to a user-level library. In many cases, the unpriv- 
ileged user has few alternatives but to implement 
his extension as some type of user-level library. Our 
example extension provided file encryption services. 
The encryption algorithm is an enhanced version of 
the one used in the German World War II Enigma
encryption machine [16]. While not a strong encryption
method, this example did provide a predictable
processing overhead on each read and write.

Table 2: Single Client Times. Columns: File Size (KB); Library mean and stddev; Frigate mean and stddev. All runtimes in seconds; Frigate time is also given as a percent of Library time. Legible entries, in source order: Library block 0.05; 0.23; 26.4 0.18; 295.6 5.80 | 343.2; Frigate block 2.0.

The library implementation of our encryption ex- 
tension simply followed any read of the encrypted file 
by a function call to decrypt the incoming block of 
data. Writes were preceded by a call to encrypt the 
outgoing data. File operations were performed on 
a UFS file system without any additional stackable 
layers. 

The Frigate implementation redefined the 
read/write vnode method to provide transparent 
encryption and decryption. A new ILU method was 
added to provide the session key to the extension. 
Because Frigate is positioned below the system call 
interface, file data could be buffered and shared 
between clients in the clear. To prevent unauthor- 
ized parties from gaining access to clear text data, 
other vnode methods were also redefined with added 
security features. 

Our first experiment used a single client, which 
provided a mix of file I/O operations and compute 
processing. Each block in the file was read, modi- 
fied, and written back into another location in the 
file. The elapsed time of the library and Frigate im- 
plementations is shown in Table 2. Frigate times as 
a percentage of library runtimes are also shown. 

Overall, we see for small cases (e.g. 8 KB file size) 
that the library performs much better. The reason 
for this result is the server startup time for Frig- 
ate. As file size (and hence overall I/O) increases, 
the server startup time is more effectively amortized 
over the entire run. In both library and Frigate cases,
the same amount of processing is done by the driving
client application and in encryption operations.
The difference in performance comes from the lower
throughput of Frigate. The additional overheads in
Frigate include the two additional stackable layers,
the limited speed of our transport, and the need to
context switch to the server process to handle requests.
For general use, further performance tuning
of Frigate is probably necessary. In this case, the
largest payoffs in improving performance are in improving
our IPC throughput and server startup time.

Table 3: Shared File Performance. Columns: File Size (bytes); Library mean and stddev; Frigate mean and stddev. All runtimes in seconds; Frigate time is also given as a percent of Library time. Legible file sizes: 8 K, 80 K, 800 K. Legible Library entries (mean, stddev), in source order: 0.2 0.05; 5.5 0.22; 52.4 0.97; 556.3 109.77.

Our second experiment involved two clients shar- 
ing access to an encrypted file. One process read, 
modified, and wrote each block of the file. Before 
proceeding to the next block, the second client also 
read, modified, and wrote the same block. The entire 
file was processed 10 times in this fashion. 

The elapsed runtimes for the library and Frigate 
implementations are shown in Table 3. The runtimes 
for Frigate as a percentage of the library runtimes are 
also shown. The file size, which is directly tied to the 
total amount of I/O, was varied in our experiment. 
(The 8000K case was omitted because the library 
solution had an excessive runtime.) 

The server startup cost again makes the library 
implementation faster in the smallest cases. Since 
so little data is actually transferred in the one byte 
file case, it essentially only measures the overhead of 
the framework. However, in the larger cases, Frigate 
does significantly better. The ability to share text in 
the clear reduces runtime substantially. In the largest 
case shown, Frigate runs in less than one-seventh the 
time of the library implementation. The savings are
mostly due to avoiding redundant encryption and decryption
operations. Frigate need only perform these
operations between its buffers and the disk. The lib- 
rary implementation must perform such operations 
on each I/O. This fact more than compensates for 
any I/O throughput reduction suffered by Frigate. 
In this case, Frigate’s architecture provides a unique 
advantage not otherwise available. 
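
The source of the savings can be made concrete by counting cryptographic operations in the shared-file experiment. The model below is an idealized sketch, not a reproduction of the measurements: it assumes the cleartext buffers hold the whole file, so that each block is decrypted once when loaded and encrypted once when flushed, and it ignores IPC and startup costs. The function names and the block count are hypothetical.

```python
def library_crypto_ops(blocks: int, clients: int = 2, passes: int = 10) -> int:
    # Every read is followed by a decrypt and every write is preceded by
    # an encrypt, for each client on each pass over the file.
    return blocks * clients * passes * 2


def buffered_crypto_ops(blocks: int) -> int:
    # Assumption: data is shared in the clear between clients, so crypto
    # happens only between the buffers and the disk, once in each direction.
    return blocks * 2


blocks = 100  # illustrative block count, not taken from the experiment
print(library_crypto_ops(blocks))   # 4000
print(buffered_crypto_ops(blocks))  # 200
```

In this model the library's operation count grows with the number of clients and passes, while the buffered count depends only on the file size.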


8 Related Work 


Frigate is comparable to a number of different sys- 
tems. In general, we classify these systems according 
to their structure and functionality. A key question 
is where the extension is added to the overall sys- 
tem. File service extensions can be added to client 
processes, system libraries or the operating system. 
If extensions are implemented in servers, then some 
sort of “hook” is inserted into one of these locations 
to intercept the relevant calls. 

Frigate uses servers. The intercept mechanism for 
Frigate is the Dispatcher layer, which intercepts se- 
lected vnode operations inside the operating system. 
Since all file accesses pass through the layer, Frig- 
ate can ensure that an extension has complete con- 
trol over file operations. Frigate’s user-level serv- 
ers allow easy development. The process boundary 
around servers also protects the operating system 
from buggy extensions. The drawback is that per- 
formance can suffer due to the need for IPC between 
the Dispatcher and server. Frigate’s object-oriented 
environment is provided for the client and server by 
using a CORBA based solution. The intercept mech- 
anism in between is essentially ignorant of objects 
other than vnodes. 


8.1 Languages & Libraries 


Programs written in object-oriented languages can 
use objects with support for persistent storage. Typ- 
ically, such objects can provide a high-level method 
interface appropriate to the application instead of the 
primitive system call interface. Support for persist- 
ence may be a builtin language feature. An element- 
ary example is found in the Eiffel programming lan- 
guage [34]. The “storable” class provides methods to 
read and write an internal language representation of 
an object to a file. Other classes gain persistence by 
merely inheriting the storable class. 

When languages have support for persistent ob- 
jects, they can offer a natural, enhanced filing in- 
terface that is fully integrated with the rest of their 
language environment. On the other hand, their rich 
world is not available outside of their own language 
environments. They are not integrated with the operating
system and cannot provide any encapsulation
of persistent objects outside of the language envir- 
onment. These systems and others that choose to 
remain above the system call interface can be free of 
the problems inherent to developing in the operating 
system address space. However, by not being able to 
go below the fixed system call interface, their flex- 
ibility and ability to enforce policy is limited. For 
example, it is difficult above the system call inter- 
face to even ensure unambiguous file identity. As 
we saw in the performance tests, it can also be a 
performance liability. 


Most CORBA [36] based object frameworks are
designed to integrate with target language envir- 
onments. Thus, for the most part, they share the 
characteristics of language-based solutions. Object 
frameworks with “applet” and “serverlet” features 
offer a novel form of modularity. However, when ac- 
tually executing, they too share the characteristics of 
language-based solutions. Some other object frame- 
work cases, such as OLE 2 [35] and ActiveX [8], are 
less clear, since it is difficult to determine how much 
has been subsumed by the operating system. 


Another related approach uses libraries. Com- 
monly used system libraries are modified to enhance 
the behavior of file operations. Often these librar- 
ies are shared and dynamically loaded. This al- 
lows existing compiled programs to use the modi- 
fied library without recompilation. Of course, stat- 
ically linked programs will not be able to take ad- 
vantage of redefined, enhanced operations and will 
break encapsulation. Examples of library systems 
are the 3-D File System [29], COLA [80] and IFS [12]. 
These particular systems provide additional (albeit 
not object-oriented) functionality by intercepting 
library calls to the standard UNIX system calls. 
Like language-based solutions, library systems reside 
above the system call interface and thus inherit the 
associated tradeoffs. 


8.2 Servers 


Another form of operating system extension is the 
server. The server usually executes at user-level out- 
side of the operating system (or at least outside of 
privileged execution modes). The intercept mechan- 
ism may be above or below the system call interface. 
Intercepted calls must be passed to the extension by 
some form of IPC. This may cause some loss in per- 
formance. 

The Object-Oriented File System [47] intercepts 
UNIX system calls at the library level and passes 
them to servers using pipes or UNIX domain sockets.
The modified library must be explicitly linked
with clients and thus will only work with new applications.
Despite its name, it does not offer an
object-oriented programming interface. There are 
no classes, methods, inheritance, etc. in this system. 

The Distributed Object-Based System 
(DOBS) [11] provides automatically started servers 
that respond to object methods. Communication 
between client and server process is through Sun- 
RPC. This system is similar in flavor to the ILU 
part of Frigate. DOBS defines its own simple 
interface language to describe methods. It does 
not appear to have the ability to manipulate any 
non-file objects, nor is there any inheritance. Also, 
it does not have any integration with the UNIX file 
operations. No existing UNIX file operations can 
be redefined; one can only define new operations. 
As a result, the system cannot be used to affect the 
behavior of existing programs and encapsulation is 
not enforced. 

Some operating systems offer special debugging 
mechanisms to intercept system calls. As long as the 
intercept can be properly set up, complete encapsu- 
lation is possible. Interposition Agents [26] use this 
strategy. An object-oriented toolkit is provided as a 
framework for implementing new services. However, 
the toolkit is not exposed to clients. Instead, each 
agent is provided with a symmetric system call in- 
terface allowing the layering of agents. Thus, clients 
continue to see the same unextended system call in- 
terface. In this system, agent(s) are attached to spe- 
cific clients and are apparently not shared. 

The File Monitor Interface [49] provides a user- 
level server facility that responds to vnode calls in- 
tercepted inside the operating system. The intercept 
mechanism is similar to the Frigate Dispatcher and 
can selectively intercept calls based on special file 
attributes. However, there is no provision for an 
object-oriented programming model. 

Similar facilities were also built for other operating 
systems. Pseudo-File-Systems [51] were developed 
for Sprite [38]. Userfs [15] provides this service for 
LINUX. Stackable Layers also provides this function- 
ality on vnode operations through transport layers. 
These three systems use the mount mechanism to 
provide a volume granularity redirection to servers. 
They also require servers to be started prior to use. 

Watchdogs [4] and Extensible Streams [41] go 
somewhat further in that servers are started auto- 
matically, if none is running. Both also use type 
identifiers similar to Frigate’s class attributes. This 
allows specifying servers on a file level granularity. 

Micro-kernels take the server approach and apply 
it to reorganize the entire operating system. A modest
number of basic abstractions are implemented in
a small (“micro”) kernel. The bulk of the operating
system is moved into separate servers. Examples in- 
clude Mach [1] and Chorus [45]. Exokernel [13] and 
Cache Kernel [9] attempt to push this organization 
as far as possible. In a micro-kernel, the file sys- 
tem is usually implemented by one or more servers. 
In the process of reorganization, many micro-kernels 
and their file systems have also acquired an object- 
oriented flavor. An example is Mach 3.0 [17, 46]. 
While this object orientation is of great use in the 
internal development and specialization of the operating
system, it is not generally exported to users.
Users continue to see the same unextended system 
call interface. 


8.3 Extensible Operating Systems 


A number of approaches to increase extensibil- 
ity of the traditional file system have been tried. 
Early attempts to add extensibility interfaces to 
the operating system included streams [42] and dy- 
namically loaded device drivers. The latter is in- 
cluded in systems such as SunOS and Chorus [2]. 
Other attempts to restructure the file system include 
VFS [28] and UCLA Stackable Layers [22, 23], which
we have already described. An alternative model of 
stacking is described in [44]. The Spring operating 
system [27, 40] also offers stacking with additional
object-oriented features. 

Recently some new operating systems extend their 
functionality by downloading “safe” modules into the 
operating system. The modules are made safe by 
restricting address references [48] or by using safe 
languages. An example of the latter is the SPIN op- 
erating system [5], which uses Modula-3 [7] as its 
extension language. With the use of the techniques
mentioned above, relative safety can be assured. In
some cases, though, there is limited space to add ex- 
tension code and thus large extensions are not pos- 
sible. 

Unfortunately, any facility requiring development 
and debugging in the kernel address space needs spe- 
cial tools and privilege. Also, these tools were de- 
signed for expert specialization of the operating sys- 
tem. They are not meant for casual users. However, 
these approaches potentially perform better than 
servers, because there is no need to cross process 
boundaries to reach the extension. 


8.4 Object-Oriented Operating 
Systems 


Another approach is to provide operating systems 
that are built on an object-oriented paradigm from 
the ground up. The services they offer, including
the file system, are object-oriented. Extensibility
is provided by using subclasses and polymorphism. 
This is to be distinguished from merely writing a 
system in an object-oriented language. See [21] for 
further discussion on this point. Of course, the two 
are not mutually exclusive; many such systems are 
actually written in an object-oriented language or 
use one for its client interface. 

Choices [6, 32, 33] implements an object-oriented 
operating system written in C++ with some exten- 
sions. Operating system abstractions, including files, 
are provided as objects in a class hierarchy. Exten- 
sions can be added by subclassing existing classes. 
However, the extension framework is not targeted 
for use by the ordinary user. Rather, it is oriented 
toward controlled specialization of the operating sys- 
tem by privileged expert users. Choices does try to 
address the problems of operating system debugging 
by providing a user-level emulator. 

Clouds [10, 39] offers a complete object-oriented 
operating system. All services in Clouds are offered 
as objects accessed through capabilities. The rad- 
ical persistent object model makes no distinction 
between memory objects and file system. Essen- 
tially, all objects are located in a large distributed 
virtual memory. Objects in Clouds are relatively 
large grain with operating system enforced encap- 
sulation. The COOL [20] system provides a similar 
model built on top of the Chorus micro-kernel. 

True object-oriented operating systems can 
provide powerful extensible environments. Their 
main drawback is their incompatibility with the rest 
of the world. The current investment in software is 
lost. Any desired feature must be ported and pos- 
sibly rewritten to fit with the new paradigm. One 
must make an “all or nothing” switch to the new 
system. Only a few systems such as Solaris MC [3] 
and Frigate explicitly attempt to integrate compat- 
ible use with a new object-oriented operating system 
interface. 


9 Future Work 


Frigate presents a number of directions for future 
work. These concepts are enhancements that can be 
incrementally added to the current, fully operational 
system. 

Frigate is currently implemented on the SunOS 
4.1.1 infrastructure of Stackable Layers. As Stack- 
able Layers is ported to other platforms, Frigate can 
also be ported with only modest effort. On its cur- 
rent platform, the Frigate client environment could 
also be expanded to include other ILU languages 
such as C++ and Modula-3 [7].

Some improvements could also be made to the
server structure. The most fundamental would be 
to allow object types to be dynamically loadable. 
This would allow implementations to be decom- 
posed into a shared server executable and an ap- 
plication specific set of dynamically loadable imple- 
mentations. Servers would tailor themselves while 
running and load implementations only according to 
demand. Currently, a compound document server 
must have implementations of all possible compon- 
ent classes linked into the executable. In a dynamic 
architecture, compound documents would load only 
what was demanded by the access pattern of their 
constituent components. New classes and versions 
would not require relinking a server executable. In 
effect, we would be extending late binding to server 
binary construction as well. 


Another possible improvement is to allow servers 
to be distributed on other hosts. This would allow 
additional server configuration flexibility and better 
load balancing. To allow this possibility, a new dis- 
tributed alternative to the Stackable Layers user-to- 
kernel transport must be provided along with some 
enhancements to the binding process. 


A number of performance improvements are possible
as well. Probably, the largest payoffs in performance
are in improving the separate attribute service,
the communication throughput with the server
and the server startup time. For servers on the 
same host, considerably better performance is prob- 
ably possible. Current kernel-to-server transports 
are based on implementations of NFS. Much more 
efficient interprocess communication could be used. 
Improvements would probably use strategies similar 
to those used in micro-kernel designs (see [31]). 


Server startup is an expensive process involving
new process creation and process image overlay. 
When in the critical path of execution, this can res- 
ult in a substantial delay. One strategy to improve 
things would be to cache servers. Servers would not 
be killed until a timeout period had expired. Fur- 
ther uses of the same implementation would not re- 
quire a new server to start. Another complement- 
ary strategy would start up a number of dynamically 
loadable servers ahead of time. Unassigned servers 
may even be preloaded with some implementations. 
As servers are needed, they are assigned from the 
pool of servers. In this way, the server startup time 
is avoided, though the time cost for dynamic loading
must still be paid. 
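
The caching strategy can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not Frigate code: the ServerCache class, its acquire and expire methods, the integer tick clock, and the counter that stands in for real process creation are all hypothetical names introduced for the example.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical record for one cached server process.
struct CachedServer {
    int pid;         // stand-in for a real process identifier
    long last_used;  // time of the most recent request, in arbitrary ticks
};

// Keeps servers alive until they have been idle for `timeout` ticks.
class ServerCache {
public:
    explicit ServerCache(long timeout)
        : timeout_(timeout), next_pid_(100), starts_(0) {}

    // Returns a server for the named implementation, starting a new one
    // only when no unexpired server is cached.
    int acquire(const std::string& implementation, long now) {
        expire(now);
        std::map<std::string, CachedServer>::iterator it =
            servers_.find(implementation);
        if (it != servers_.end()) {
            it->second.last_used = now;  // reuse: no startup cost
            return it->second.pid;
        }
        // Stand-in for process creation and process image overlay.
        CachedServer fresh = { next_pid_++, now };
        ++starts_;
        servers_[implementation] = fresh;
        return fresh.pid;
    }

    // Drops every server whose idle time has reached the timeout.
    void expire(long now) {
        std::map<std::string, CachedServer>::iterator it = servers_.begin();
        while (it != servers_.end()) {
            if (now - it->second.last_used >= timeout_) {
                it = servers_.erase(it);
            } else {
                ++it;
            }
        }
    }

    // Number of (simulated) expensive server startups so far.
    int starts() const { return starts_; }

private:
    long timeout_;
    int next_pid_;
    int starts_;
    std::map<std::string, CachedServer> servers_;
};
```

With a timeout of 10 ticks, two requests for the same implementation at ticks 0 and 5 share one server, while a request at tick 20 finds the server expired and pays the startup cost again.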


10 Conclusion


Operating systems and the file systems they con- 
tain have been designed as general purpose instru- 
ments. Specialized file system support for applic- 
ations could provide great benefits. Unfortunately, 
it is very difficult to extend operating systems and 
file systems. What is needed is a file system infrastructure
with extensibility designed in. While some
extensibility features have appeared, they do not go 
far enough in providing open access for extensibility. 


Frigate attempts to provide such open access to 
all users, not just to those with system privilege or 
specialized training. The philosophy driving Frig- 
ate is that the user or ordinary programmer is best 
equipped to add value to his applications, not some 
distant expert. Therefore, the challenge is to expose 
the power of underlying mechanisms in a safe and 
easy to use way. 


Towards this end, Frigate combines a user-level 
server framework, which is fully integrated with the 
operating system, with an object-oriented interface. 
The user-level structure provides adequate perform- 
ance and, at the same time, frees developers from 
the constraints of development in the operating sys- 
tem address space. It allows the freedom to provide 
an implementation that does not have to mirror the 
structure of a corresponding implementation inside 
the operating system address space. It is expected 
in this environment that untrusted extensions will 
be run. We use the server process boundary as a 
firewall, protecting the operating system and other 
extensions. 


The object-oriented framework provides a coher- 
ent programming model for extension that provides 
the benefits of inheritance, polymorphism and en- 
capsulation. Users do not have to learn a new pro- 
gramming paradigm for each extension. Inheritance 
allows the leveraging of previous work. Polymorph- 
ism allows generic programs to be written that need 
not change as new versions or types appear. Frigate 
also provides full encapsulation that is enforced and 
cannot be bypassed. 


Frigate provides compatibility. By being able to 
redefine vnode operations, Frigate can extend the 
behavior of existing programs. Vnode operations 
are smoothly integrated into the object-oriented pro- 
gramming model. Frigate uses standard languages 
and compilers. The implementation of object-oriented
mechanisms is carefully isolated so that
current compilers need not be disturbed. The value 
of the current software investment is preserved. By 
using Stackable Layers technology, Frigate can also 
take advantage of new work packaged as layers. 


Frigate allows incremental use. Frigate objects can 
be freely mixed in the same file system with ordinary 
files. Frigate code can cooperate with other programs 
using the standard system call interface. Frigate does 
not force one to choose between paradigms. Frigate
can be used as much or as little as desired. It is not 
an “all or nothing” approach. 

Frigate also provides a structure for controlled 
change without any disruption of operation. Not 
only are interfaces named but versions are as well. 
Particular versions can be named and relevant dis- 
tinctions can be maintained. Yet at the same time, 
versions can be transparently upgraded. The install- 
ation and upgrade of versions never causes a need to 
reboot, reinitialize or interrupt operation of the sys- 
tem. 

Frigate scales well. Interfaces and versions can be 
developed independently of each other without fear 
of name clashes. The namespace for interfaces and
versions is assigned in a distributed fashion without 
the need for a central registry. Practically speak- 
ing, Frigate is only limited by runtime resources for 
server processes and not by arbitrary design limits. 

Other systems have only addressed a few of these 
aspects. In short, we believe Frigate uniquely opens 
the file system to the world.


Abstract


This paper presents uCR, a C++ runtime package 
for embedded program development. We make the 
case that in certain situations embedded program- 
ming is best done without the aid of a conventional 
operating system. A programming environment in 
the form of a C++ runtime is presented, and the 
environment, including the C++ language, is evalu- 
ated for appropriateness. Important factors are code 
size, performance, simplicity and applicability to a 
wide range of embedded targets. 


1 The Problem 


It is common, when building a newly designed board,
to install only a few components at a time and test 
the partially built board to protect expensive com- 
ponents, to validate portions of a design, or just to 
contain the hardware debugging problems. The first 
time power is applied to a board, often only the 
CPU, memory and ROM socket are installed. Natu- 
rally, software is usually required and a development 
environment that works in this case is necessary, es- 
pecially as the board design and construction pro- 
gresses. 

Even complex designs can have real estate con- 
straints, leaving no room for the extra hardware to 
support a full operating system. A case example of
this is shown in Figure 1. In order to fit this design 
on a PCI card, extra parts like UARTS had to be 
left out, and program memory had to be kept to one 
flash and 2 DRAM chips. 

Conventional operating systems usually serve two 
interesting roles: they abstract the target hardware, 
and they provide a means of loading and execut- 
ing programs, often in separate protection domains. 
An operating system provides an operating environ- 
ment, including but not limited to a device driver 
interface and a common interaction with the user. 
It is separated from applications by a kernel struc- 
ture, bounded by trap handlers or some form of call 
gate that allows the operating system to function to 


some degree independent of and protected from the 
applications that it carries. 

Several commercial embedded operating systems 
are available that run on the relatively conventional 
CPU in Figure 1, but most commercial operating 
systems, available in binary form, require board sup- 
port packages written to provide the necessary sup- 
port for the O/S, including a console, time ticks, and 
memory setup. 

The ISE board (Figure 1) in particular has no se- 
rial port, so program loading must be done either by 
programming the socketed FLASH memory with a 
prom programmer, or writing into the board support 
package a console driver that uses the PCI bus to 
communicate as a console. The MON960 monitor [8] 
supports the latter, and the Cyclone-911 board [4] 
in particular can be used this way, given the appropriate
host software.¹

Although it is sometimes nice to have an operating 
system that is portable, and essential that certain li- 
braries be portable, it is rare that an embedded pro- 
gram is, or should be, portable. The whole point of a 
program is to manipulate the specific toaster. There 
is no value being able to run the toaster program 
on the VCR. It therefore is rarely useful to have a 
device-driver interface in an embedded kernel—such 
can actually make things harder. 

We questioned the prudence of forcing a kernel- 
ized operating system onto a board with only a few 
LEDs and an oscilloscope for debug output, and a
ROM socket for input. We anticipated this happen- 
ing often, as designing and building boards is our 
business. We also noted that the device driver inter- 
face of a kernel is pointless, and our targets typically 
run a single trusted program from reset to power off. 
We eventually concluded that we didn’t really need 
an operating system at all. 

This, then, became the chosen path. We wrote a
minimal runtime to support C and C++ that works
on the sorts of target boards expected, and we provided
that support for a specific compiler, the GNU
GCC compiler. Writing the support for the compiler
alone, we reasoned, would be easier than writing a
board support package for compiler and operating
system and would get everything needed without the
added constraints of an operating system.


¹ Picture Elements supplies with the ISE board a bootstrap
loader that loads COFF files from the PCI bus. The loader is
written using uCR and the techniques described in this paper.


This runtime support for the compiler, called 
uCR², proved lightweight and powerful enough that
we not only used it as the regular development envi- 
ronment, we used it to build bootstrap loaders and 
other programs in support of embedded development 
itself. 


2 uCR, C and C++ 


The difficulty of porting embedded operating systems
lies more often in dealing with the board than with
the CPU proper. It is the memory layout and I/O
devices that make portable operating systems difficult.
If one can reduce the development environment
to something independent of the board, then there is 
only the CPU to worry about and not all the devices 
around it. uCR/i960 can run on any i960 without 
porting.³

Eliminate the devices from the environment, and 
eliminate the system call gate, and what remains is 
a runtime that connects the compiler to the CPU. 
In the embedded world, anything is possible, and imposing
irrelevant requirements like a console and a
clock ticker can make things harder. 


The problem, then, simply reduces to how one 
maps C++ to a CPU, with nothing else. Although 
board specific code still needs to be written, specif- 
ically the reset handler and application code, uCR 
makes no requirements other than those needed to
support the compiler. This is a task to remind one 
about the nature of a programming language. 


This is not quite the same as more conventional
development environments requiring board support 
packages. Although uCR in practice needs board- 
specific startup code, it does not impose a style of 
interaction with the board and programmer. Typ- 
ical systems require of the board support package 
console drivers, timer drivers (with the timer config- 
ured to tick at a specific rate) and package drivers 
to initialize the options you choose to include. uCR 
imposes no such requirements. 


² uCR is an abbreviation of “Micro-C/C++ Runtime,”
pronounced U-C-R.

³ In practice, the uCR package, though not the core library,
includes code specific to select boards in order to get the developer
started writing more complex programs.


2.1 What uCR is 


uCR is more properly called a C/C++ runtime than 
an operating system. A programmer writes an ap- 
plication program in C++ and compiles with the 
uCR headers. The compiler generates assembly code 
from the source files that the programmer writes, 
and what the compiler cannot do it delegates to the 
execution environment. uCR provides the execution 
environment for the generated code. 

The programmer links the resulting object code 
with uCR libraries that fill in the parts left out by 
the compiler, and gets an executable image. This im- 
age is loaded into the target by prom programmer, 
ROM emulator, serial download—whatever works 
for you—and is executed. 

uCR libraries add support for thread program- 
ming and interrupt handlers. These are features de- 
pendent on the CPU and not the board, so includ- 
ing them does not introduce board support prob- 
lems. The uCR distribution also includes ancillary 
libraries that contain device classes, and other code 
that may not be specific to the CPU but is com- 
monly used. 

Interrupt handler support is also included with the 
uCR core library, because again it is a matter for the 
CPU and compiler how interrupt service routines are 
entered and left, and not specific to target boards. 
What the target boards do fix is the assignment of 
devices to specific interrupts, so uCR makes no at- 
tempt to guess such things. 


2.2 What uCR is not 


uCR is not a kernel, or a micro-kernel. There are no 
system calls and there are no task structures such as 
page translation tables or system call gates. Calls to 
uCR operations are ordinary function (or method) 
calls. 

uCR also is not intended to abstract the board
away. Operating systems that try to abstract the 
hardware away wind up instead requiring that the 
hardware be a certain way. That is frequently
counter-productive. The uCR core does nothing to 
devices other than the CPU, and does nothing to get 
in between devices and the programmer. 


2.3 Requirements of the Languages 


Both C and C++ place some minimum requirements 
on the execution environment, but many of the con- 
straints are imposed by the compiler, not the lan- 
guage. If some language construct can be handled 
easily by the CPU, then the compiler typically just 
generates the assembly code to deal with it. That is 


Conference on Object-Oriented Technologies and Systems - June 16-20, 1997 




what compilers are for. However, when something is 
too hard, it gives up and generates a call to external 
code. 

The C language is relatively easy to compile and 
the compiler only generates call instructions for calls 
to external functions. Floating point emulation is
often placed in a library as well, if the target can 
reasonably be expected not to have a floating point 
unit. The C++ language is a bit more interesting. 
It has difficult constructs that compilers often give 
up on, like dynamic memory allocation. 

The C language standard has a substandard, the 
freestanding C standard [1], to guide the imple- 
menter on what can be left out of the development 
environment and still be worthy of the name “C”.
In a nutshell, libraries are optional in a freestanding
environment. It is rare for an environment to not
include some of the more important optional parts, 
though. 

The C++ Working Paper has a similar substandard [3].⁴
The standard libraries are not required
of a freestanding C++ environment. Only a few
support libraries, some specific C library routines, 
and support for the “new” and “delete” operators 
are expected. Things like streamio are certainly
not required of a freestanding C++ execution en- 
vironment, although a specific implementation may 
choose to provide it. 

Static initializers must obviously be done cor- 
rectly. To fail to initialize static objects is a clear 
and gross error, but the compiler certainly does not 
know how to arrange that on my toaster CPU. That, 
like the minimum library support, becomes a matter 
for the runtime, namely uCR. 

The g++ compiler generates external calls for new 
and delete. This way, memory allocation can easily 
be provided with some help at link time. Most tar- 
gets have memory available for allocation, and some 
have several different kinds of memory for allocation. 
Even if the default allocation operators do not apply, 
the placement “new” operator has some interesting 
advantages. 
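
One such advantage can be illustrated with the standard placement form of “new”, which constructs an object in storage the programmer has already chosen and so needs no heap at all. The sketch below is an assumption-laden illustration rather than uCR code: RingBuffer, ring_storage and make_ring are hypothetical names, and only the standard operator from the <new> header is used.

```cpp
#include <cassert>
#include <new>

// Hypothetical small container; any class with a constructor would do.
class RingBuffer {
public:
    RingBuffer() : head_(0), tail_(0) {}

    bool empty() const { return head_ == tail_; }

    // Stores one byte; returns false when the buffer is full.
    bool put(unsigned char byte) {
        unsigned next = (head_ + 1) % sizeof data_;
        if (next == tail_) return false;
        data_[head_] = byte;
        head_ = next;
        return true;
    }

private:
    unsigned head_;
    unsigned tail_;
    unsigned char data_[32];
};

// Statically reserved, suitably aligned storage. On a real board this could
// instead be a fixed address in a particular kind of memory.
alignas(RingBuffer) unsigned char ring_storage[sizeof(RingBuffer)];

// Constructs the object in place; no allocator is called.
RingBuffer* make_ring() {
    return new (ring_storage) RingBuffer;
}
```

An object built this way is destroyed by calling its destructor explicitly, since there is no allocation for “delete” to release.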


2.4 Requirements of Common Sense 


Ultimately, it is not enough to just make the compiler
happy. The programmer using the compiler
is the real customer and the programmer wants to
make devices do interesting things. It is therefore
not useful to have a beautiful and fast string manipulation
library if the programmer cannot fit the
program in memory. 


⁴ Strictly speaking, C++ doesn’t yet have any standard at
all.

Therefore, any practical system should make as
much of the standard libraries as reasonable avail- 
able to the programmer, without imposing extra 
costs. One programmer may not wish to pay for 
stdio, but another may find it worthwhile.

Finally, we wanted a lot of power out of uCR, 
but simplicity and efficiency were most important. 
It is intended to be a language runtime, so certain 
standards, such as POSIX [5], were not considered 
desirable for embedded applications. We also tried
to keep the size and complexity of the programming 
interface small and understandable. When testing a 
new board design, or debugging an older malfunc- 
tioning board, simple and obvious software behavior 
has its own special value. 


3 Object Oriented Design 


By using C++, Picture Elements gained a chance 
to use object oriented techniques in an embedded 
context, with threads of execution and interrupts. 
The obvious potential object classes are: 


• Threads,
• Synchronization variables,
• Devices,
• The Debugger,
• Various containers.


The various containers, which include ring buffers, lists,
strings, and the other sorts of things one expects of
object oriented designs, are not unique to embedded
programming and so will not be discussed here [3, 10, 12].

Incidentally, the requirement of keeping uCR sim- 
ple and to the point precluded creation of a large and 
complex system. Instead of a rich texture of objects 
and classes, we finished with a simple and elegant 
design, with a few general but very useful classes. 
A welcome benefit of this is that programmers can 
learn and be productive with uCR relatively quickly. 


3.1 Threads as Objects 


The thread programming interface for uCR was re- 
vised several times before the current interface was 
settled on. At first, we designed threads as ob- 
jects with interesting methods and put them in 
ThreadQueue containers. Eventually, however, we 
chose to attach most of the thread methods to the 
ThreadQueue class and left the THREAD a passive, 
opaque object. The run queue, we reasoned, would
be a ThreadQueue that the programmer can manipulate
like any other ThreadQueue. Threads in the
run queue would be subject to execution, and could 
be suspended simply by pulling the thread from the 
run queue and placing it elsewhere. 


class ThreadQueue {
    public:
      // Pass true to the flag to
      // put the thread in front.
      void enqueue(THREAD*, bool=false);
      THREAD *pull();
      THREAD *peek();
};


This proved successful, though occasionally cumbersome
and less clear than the more conventional
POSIX-style thread functions. uCR internally uses
the ThreadQueue class for the run queue and suspension
lists in synchronization primitives. These
instances are generally hidden from the application
programmers, who have ultimately chosen to use
POSIX-style thread functions provided in the uCR
library. Programmers may use the ThreadQueue
class to implement new synchronization primitives,
if desired.
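
To make the queue semantics concrete, the following is a minimal sketch of one way such a class could be built. It is not the uCR implementation: the layout of THREAD (which uCR keeps opaque), the intrusive next link, and the suspend_front helper are assumptions made for the example; only the enqueue, pull and peek signatures come from the declaration above.

```cpp
#include <cassert>

// Hypothetical layout. In uCR the THREAD type is opaque to the programmer.
struct THREAD {
    THREAD* next;  // intrusive link, so queue operations never allocate
    int id;
};

class ThreadQueue {
public:
    ThreadQueue() : head_(nullptr), tail_(nullptr) {}

    // Pass true to the flag to put the thread in front.
    void enqueue(THREAD* thread, bool front = false) {
        thread->next = nullptr;
        if (head_ == nullptr) {
            head_ = tail_ = thread;
        } else if (front) {
            thread->next = head_;
            head_ = thread;
        } else {
            tail_->next = thread;
            tail_ = thread;
        }
    }

    // Removes and returns the first thread, or a null pointer if empty.
    THREAD* pull() {
        THREAD* thread = head_;
        if (thread != nullptr) {
            head_ = thread->next;
            if (head_ == nullptr) tail_ = nullptr;
            thread->next = nullptr;
        }
        return thread;
    }

    // Returns the first thread without removing it.
    THREAD* peek() { return head_; }

private:
    THREAD* head_;
    THREAD* tail_;
};

// Suspension as described in the text: pull a thread from the run queue
// and place it on another queue, here a wait list.
THREAD* suspend_front(ThreadQueue& run_queue, ThreadQueue& wait_list) {
    THREAD* thread = run_queue.pull();
    if (thread != nullptr) wait_list.enqueue(thread);
    return thread;
}
```

A synchronization primitive built this way would own a ThreadQueue as its wait list and later move threads back to the run queue with enqueue.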


The threads themselves are opaque objects of type 
THREAD, and are passed around to the ThreadQueue 
objects and thread manipulation functions like a 
thread identifier in a more conventional thread pack- 
age. The THREAD object is a completely opaque to- 
ken used by the programmer, and uCR, to represent 
the object that is a thread. This is more an abstract 
data type design than an object oriented design. It
is a semantic quirk that this C abstract data type is 
much like a concrete object class in C++. 


The idea of a thread as an abstract class with a 
virtual method for its behavior is known to us. Some 
call this paradigm an active object. [2] Active objects 
are different from passive objects in that they have 
their own thread of execution and activate passive 
objects by calling methods. Threads and interrupt 
handlers are two different kinds of active objects;
synchronization primitives and devices are examples of
passive objects.


We experimented with active objects, but ul- 
timately decided to implement threads using the 
THREAD token and a set of conventional thread ma- 
nipulation functions. The traditional thread func- 
tions are well established, easy to use, small and 
generally don’t come into play once the threads are 
created and started. However, the specific interface 
built into uCR allows for efficient implementation
of active objects if desired. 
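
An active object can be layered on a conventional thread-creation function with a small trampoline, as the sketch below suggests. Everything here is hypothetical: thread_create and run_all form a toy stand-in that runs each thread body to completion in turn, not the uCR thread functions, and ActiveObject and Accumulator are names invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a conventional thread interface (an assumption of this
// sketch): a thread is an entry function plus one opaque argument.
typedef void (*ThreadEntry)(void*);

struct PendingThread {
    ThreadEntry entry;
    void* argument;
};

std::vector<PendingThread>& pending_threads() {
    static std::vector<PendingThread> list;
    return list;
}

void thread_create(ThreadEntry entry, void* argument) {
    PendingThread thread = { entry, argument };
    pending_threads().push_back(thread);
}

// Runs every created thread body to completion, in creation order.
void run_all() {
    for (std::size_t i = 0; i < pending_threads().size(); ++i) {
        PendingThread thread = pending_threads()[i];
        thread.entry(thread.argument);
    }
    pending_threads().clear();
}

// Abstract class with a virtual method for the thread's behavior.
class ActiveObject {
public:
    virtual ~ActiveObject() {}

    // Creates the thread; the object itself is passed as the argument.
    void start() { thread_create(&ActiveObject::trampoline, this); }

protected:
    virtual void run() = 0;

private:
    // Plain function the thread layer can call; it dispatches to run().
    static void trampoline(void* self) {
        static_cast<ActiveObject*>(self)->run();
    }
};

// Example behavior: sums 1 through 4 when its thread runs.
class Accumulator : public ActiveObject {
public:
    Accumulator() : total(0) {}
    int total;

protected:
    void run() override {
        for (int i = 1; i <= 4; ++i) total += i;
    }
};
```

The cost over the plain thread functions is one static trampoline and one virtual call per thread start, which is why a function-based interface can support this style without being designed around it.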


3.2 Memory Heaps 


Support for memory allocation in uCR is itself writ- 
ten in C++. However, special care must be taken 
that all the data structures for managing the default 
heap in particular be in place before any static ini- 
tializers are called. To arrange for this, the default 
heap initializer is called separately and ahead of ini- 
tializers by the startup function of uCR. 

The HEAP_SPACE object is an object token like the 
THREAD type, and represents a segment of heap space 
from which memory may be allocated. The uCR 
startup creates an initial HEAP_SPACE object that is
used by malloc and non-placement new operators. 

Embedded systems often have different kinds of 
memory, for example SRAM for small private ob- 
jects, or perhaps synchronous DRAM for manipula- 
tion of large images, and so uCR allows programmers 
to create other heaps in specified sections of address 
space. 

The uCR support for memory allocation adds a 
placement new operator that takes as a parameter 
the HEAP_SPACE to use. Because of the way the heap 
data structures are designed, deallocation does not 
require a reference to the HEAP_SPACE object that
created it. This feature was specifically included to 
support the delete operator. Any memory allocated 
with “new” or “new(HEAP_SPACE*)” can be deleted 
with “delete”. This is necessary because “delete” 
cannot be overloaded to take a HEAP_SPACE parameter,
and to require such would render “delete” unusable
for memory allocated from alternate heaps.
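
The idea that a block can find its own heap at deallocation time can be sketched as follows. This is an illustration only, not uCR's implementation: HeapSpace, BlockHeader, heap_alloc, heap_free and Pixel are hypothetical names, the allocator is a trivial bump allocator that never reuses memory, and the operators are scoped to one class instead of replacing the global ones. Each block carries a small header naming its owning heap, so "delete" needs no heap argument.

```cpp
#include <cstddef>
#include <new>

// Hypothetical stand-in for a HEAP_SPACE token: a bump allocator over a
// caller-supplied region of address space (assumed suitably aligned).
struct HeapSpace {
    unsigned char* base;
    std::size_t    size;
    std::size_t    used;
    std::size_t    live;   // number of blocks currently allocated
};

// Header stored in front of every block; it remembers the owning heap.
struct alignas(std::max_align_t) BlockHeader {
    HeapSpace*  owner;
    std::size_t bytes;
};

inline void* heap_alloc(HeapSpace* heap, std::size_t n) {
    const std::size_t a = alignof(std::max_align_t);
    const std::size_t payload = (n + a - 1) / a * a;   // round up to alignment
    const std::size_t total = sizeof(BlockHeader) + payload;
    if (heap->size - heap->used < total) return nullptr;
    BlockHeader* h = new (heap->base + heap->used) BlockHeader{heap, n};
    heap->used += total;
    ++heap->live;
    return h + 1;                                      // payload follows header
}

// No heap parameter: the header in front of the block identifies the heap.
inline void heap_free(void* p) {
    if (!p) return;
    BlockHeader* h = static_cast<BlockHeader*>(p) - 1;
    --h->owner->live;   // a real heap would also recycle the block here
}

struct Pixel {
    int r, g, b;
    // Placement form: new (heap) Pixel{...}
    static void* operator new(std::size_t n, HeapSpace* heap) {
        void* p = heap_alloc(heap, n);
        if (!p) throw std::bad_alloc();
        return p;
    }
    // Ordinary delete works for blocks from any heap.
    static void operator delete(void* p) noexcept { heap_free(p); }
    // Matching placement delete, used only if a constructor throws.
    static void operator delete(void* p, HeapSpace*) noexcept { heap_free(p); }
};
```

With a region such as an aligned 256-byte array wrapped in a HeapSpace, "new (&heap) Pixel{1, 2, 3}" allocates from that region and a plain "delete" releases it.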


3.3 Synchronization Objects


These classes were obvious and successful. uCR 
includes several synchronization classes, most derived
from the base class ISync. The ISync class
is a concrete class that allows threads to wait for a 
general condition to become true, and allows other 
threads to notify of a possible change in state. This 
base class is the most primitive and general synchro- 
nization that allows threads to interact with other 
threads and interrupt service routines.⁵


typedef bool (*sfun)(volatile void*);
class ISync {
    public:
      void sleep(sfun fun,
                 volatile void*);
      void wakeup();
};


⁵ ISRs may not block so may not wait for a condition, but
can and usually do report a potential change in state.

The ISema class is a counting semaphore implemented
with the ISync class. The only operations
supported are increment and decrement (and initial- 
ize with a specific value). The methods are easily 
implemented with the sleep and wakeup operations 
of the ISync class. 
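
A counting semaphore layered on a sleep/wakeup primitive might look like the sketch below. It is written for a hosted C++ environment, not for uCR: SketchSync and SketchSema are hypothetical names, the primitive is built from std::mutex and std::condition_variable, and its sleep() takes an extra lock argument that the uCR interface shown above does not have.

```cpp
#include <condition_variable>
#include <mutex>

typedef bool (*sfun)(volatile void*);

// Hypothetical host-side analogue of a sleep/wakeup base class.
class SketchSync {
public:
    // Block until fun(arg) reports true. The caller must hold `held`.
    void sleep(std::unique_lock<std::mutex>& held, sfun fun, volatile void* arg) {
        while (!fun(arg))
            cv_.wait(held);
    }
    // Tell sleepers that the condition may have changed.
    void wakeup() { cv_.notify_all(); }
    std::mutex& guard() { return m_; }
private:
    std::mutex m_;
    std::condition_variable cv_;
};

// Counting semaphore: only initialize, increment and decrement.
class SketchSema : private SketchSync {
public:
    explicit SketchSema(unsigned initial) : count_(initial) {}

    void decrement() {
        std::unique_lock<std::mutex> lk(guard());
        sleep(lk, &SketchSema::nonzero, &count_);   // wait for count > 0
        count_ = count_ - 1;
    }
    void increment() {
        {
            std::lock_guard<std::mutex> lk(guard());
            count_ = count_ + 1;
        }
        wakeup();
    }
    unsigned value() {
        std::lock_guard<std::mutex> lk(guard());
        return count_;
    }
private:
    static bool nonzero(volatile void* p) {
        return *static_cast<volatile unsigned*>(p) != 0;
    }
    volatile unsigned count_;
};
```

The predicate function and its argument are all the base class needs to know about the semaphore, which is why the derived class stays this small.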

The ILock class is a binary semaphore. In prin- 
ciple, it could be implemented with the ISema class, 
but it works out more efficiently derived directly 
from ISync. Operations for ILock are get and put,
and do the obvious things.

ISema, ILock and other classes are implemented 
by deriving from the ISync class. They all have simi- 
lar fundamental behavior that can be easily factored 
into the base ISync class. The Mutex class imple- 
ments monitor style synchronization and does not 
fit well in the ISync class hierarchy, so it is imple- 
mented separately. 

The Mutex class has “enter()” and “leave()” 
methods to enter critical sections of code. Only one 
thread may be active in a critical section, hence the 
synchronization. The Condition class is a way for 
a thread to sleep within a critical section. A thread 
sleeping on a condition is not considered active in the
critical section so other threads can enter. However,
the implementations of Mutex and Condition assure
that at all times there is no more than one thread
executing in the critical section. 


3.4 Devices as Objects 


uCR proper does not operate devices, or expect any 
to be present, but ancillary libraries include classes 
that drive various devices. The application may
add more device classes simply by writing the code 
needed to manipulate device registers, and putting 
that code into classes. 

Devices make good objects. Programmers seem 
to understand and respond well to this technique. 
As an example, a uCR library includes classes for 
communicating with a host through a PCI bus. The 
class “BUS” has a subtype BUS::Device that is the
abstract type of a bus interface device. The PLX9060
class is derived from BUS::Device and drives the
PLX Technology PCI9060 [11] interface chip. The
I960RP class is also derived from BUS::Device and
drives the ATU function unit of the Intel i960RP [6]
microprocessor. 

The BUS class in Figure 2 uses BUS::Device objects
to implement a channel protocol with the host
processor. The abstract BUS::Device class has a
minimum set of methods used by the BUS class for
sending packets to the host and getting packets back.

The classes derived from BUS::Device implement
the minimum methods (which are pure virtual) and
others that the device can support in addition to 
the minimum requirements. This is a fairly classic 
object oriented design. A similar hierarchy exists 
for timers (in case you choose to use them) where 
the abstract Ticker class provides a common in- 
terface for a generic clock and the derived classes 
implement the virtual methods as necessary for the 
specific hardware available. 


The Ticker class hierarchy in Figure 3 is another 
example of an object oriented device driver design, 
also taken from the uCR libraries. The Ticker 
base class provides the methods of a generic inter- 
val timer, that portable code may use. The concrete 
class derived from the Ticker drives the real hard- 
ware timer to implement the behavior of the abstract 
class. 
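
The shape of such a hierarchy can be sketched as below. The method names (start, ticks), the SimTicker class and the elapsed_us helper are assumptions made for illustration; the actual uCR Ticker interface is not shown in this paper. A simulated timer stands in for the hardware-specific derived class.

```cpp
// Abstract interval timer: portable code uses only this interface.
class Ticker {
public:
    virtual ~Ticker() {}
    virtual void start(unsigned long period_us) = 0;   // hypothetical method
    virtual unsigned long ticks() const = 0;           // hypothetical method
};

// Concrete class; a real one would program the hardware timer registers.
class SimTicker : public Ticker {
public:
    SimTicker() : period_(0), count_(0) {}
    void start(unsigned long period_us) override {
        period_ = period_us;
        count_ = 0;
    }
    unsigned long ticks() const override { return count_; }
    // Stands in for the timer interrupt handler.
    void fire() { if (period_ != 0) ++count_; }
private:
    unsigned long period_;
    unsigned long count_;
};

// Portable helper that never names the concrete timer type.
inline unsigned long elapsed_us(const Ticker& t, unsigned long period_us) {
    return t.ticks() * period_us;
}
```

Code written against Ticker& keeps working when SimTicker is replaced by a class that drives a different timer.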


Even when inheritance does not make sense, 
device driver code fits well into classes. For 
example, the XC4000⁶ class has the method
device_configure() for programming the device, 
but does not specifically support or require any 
derivation. 


Device classes can be templates, too. The uCR li- 
braries include a template class LED that is a driver 
for light-emitting diodes. This template is also use- 
ful for controlling general purpose output bits, and 
other miscellaneous jobs. 


template <class RT> class LED {
    public:
      explicit LED(volatile RT*base);
      void set(unsigned idx);
      void clr(unsigned idx);
      ...
};


Often, LEDs are connected to a register somewhere
in the address space of the processor. The
register may be a word, or a byte, or whatever the 
hardware costs and design allow. The program- 
mer uses as the template parameter the integer type 
needed to access the word (for example “char” or 
“unsigned long”) and passes to the constructor the 
address of the register. Individual bits of the regis- 
ter can then be set or cleared with the “set()” and 
“clr()” methods. 


The nice feature of this template is that the programmer
can use custom types for RT that, for example, store
their values outside the memory address space, such
as in I/O space.
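
One way such a template could be written is sketched below, together with a custom register type. The member bodies are a guess at an implementation and are not taken from the uCR sources; IoByte, io_read, io_write and io_space are hypothetical, with a small array simulating a separate I/O space.

```cpp
// Sketch of the LED template: RT is the type used to access the register.
template <class RT> class LED {
public:
    explicit LED(volatile RT* base) : reg_(base) {}
    void set(unsigned idx) { *reg_ = *reg_ | (1ul << idx); }
    void clr(unsigned idx) { *reg_ = *reg_ & ~(1ul << idx); }
private:
    volatile RT* reg_;
};

// Simulated I/O space; on real hardware these would be port instructions.
static unsigned char io_space[16];
inline unsigned io_read(unsigned port) { return io_space[port]; }
inline void io_write(unsigned port, unsigned v) {
    io_space[port] = static_cast<unsigned char>(v);
}

// Custom register type: reads and writes are redirected to I/O space.
struct IoByte {
    unsigned port;
    operator unsigned() const volatile { return io_read(port); }
    void operator=(unsigned long v) volatile {
        io_write(port, static_cast<unsigned>(v));
    }
};
```

LED<unsigned char> drives a memory-mapped byte register directly, while LED<IoByte> uses the same set() and clr() code but every access goes through the conversion and assignment operators of IoByte.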


⁶ A Xilinx Field Programmable Gate Array

Figure 2: BUS Class hierarchy

Figure 3: Ticker Class hierarchy. [Diagram annotations: the
derived classes connect the Ticker interface to hardware timers.
The I960Timer is a Ticker using the i960Jx timer, and the
TimerModule is a Ticker using the mc68322 timer module.]


3.5 The Debugger 


Some targets have sufficient I/O capabilities to sup- 
port an interface to a debugger, so uCR provides the 
GDB class. A GDB object is an interface to a suitably 
modified version of the GNU Debugger GDB that 
runs on a host computer. 

The GDB class is a concrete class that takes in 
its constructor a pointer to a suitable device class 
for use as the communication channel with the host. 
Once created, the debugger does not become avail- 
able to the host until its “go()” method is called, 
generally by a special thread. This method never 
returns and forever communicates with the host,
receiving and processing requests for action, memory,
etc.

The uCR core leaves things like breakpoint traps 
and faults available to the programmer, so the 
GDB class uses standard interfaces to gain access to 
threads and the faults they make. The debugger 
runs in a thread of its own, so can be said to be 
an active object. As an active object, it is free to 
manipulate other threads, stop them, run them, ex- 
amine memory, etc. 

It turns out that not much of what the proxy
needs to do is CPU specific, and other than actual
I/O that communicates with the host, none of it
involves board details. Many of the details of the
target CPU are managed by the host GDB program,
leaving only some CPU specific cache management
and fault handling for the GDB class to cope with.


3.6 Summary 


Table 1 summarizes the core set of types, plus a 
few others. Notice that this list is very short indeed. 
The uCR core really exists to provide a runtime con- 
text for the C++ program, and not to provide a lot 
of features. However, the thread and heap support 
tends to be compiler and CPU specific so must be 
provided in the core. 


Device drivers are not required by the compiler, 
or by uCR, but some types of devices are common 
enough to offer in a packaged library some classes 
to help the programmer. The BUS, Ticker and LED 
classes are examples of class library support for de- 
vices. The list of device types here is not exhaustive. 
By putting device support in a library instead of a 
kernel, the drivers can be brought in by the linker 
automatically if the device is used by a program.

Type          Description
THREAD        Opaque object representing a single thread
ThreadQueue   Container for THREAD objects
ISync         Interrupt safe synchronization variable
ILock         Binary semaphore
ISema         Counting semaphore
Mutex         MONITOR style thread synchronization
Condition     Condition variables used with Mutex objects
HEAP_SPACE    Opaque object representing a memory heap
BUS           BUS interface abstract class
Ticker        Timer device abstract class
LED           L-E-D and output bit driver template
GDB           Proxy for the GNU Debugger

Table 1: Common uCR Object Types


4 Performance 


All the cleverness in the world is useless if the result- 
ing design performs poorly. This in fact is where we 
met the most resistance from C programmers. The
premise of most criticism is that C++ code leads 
to inferior executables, either bloated by support 
for C++ capabilities, or somehow merely
unoptimizable.


Much effort went into proving by example that 
tight and efficient C++ code is certainly possible. 
Since we chose to stick with a specific compiler, we 
had the luxury of studying the output assembly code 
and working it until the assembly couldn’t be further 
improved. That experience has led to some general- 
izations. 


4.1 Branching and Method Calls


Compilers are good enough that sequential sections 
of code are optimized very well and branches are the 
bulk of the execution time and code space. These 
are the logic of the program, and cannot usually be 
eliminated. The optimizer can be helped by avoiding
conditional code: short loops should be unrolled, and
in certain cases it may be best not to branch around
useless code.


Object oriented designs, and C++ programs in 
particular, tend to introduce many smaller functions 
that perform near-trivial operations. For example:

class Foo {
      ...
      int value()
        { return value_; }
      unsigned size() const
        { return 16; }
      ...
};


The call to value() can be reduced to a single 
“mov” or “ld” instruction on most types of CPUs, 
and the size() method can be optimized completely 
away. However, if those methods were not defined 
inline then the compiler would be forced to generate 
a call and a return, would need to invalidate reg- 
isters and/or shift register windows, and otherwise 
multiply the complexity.

By inlining, the call instruction is eliminated and 
the basic block is expanded to surround the method 
invocation. Subexpression elimination and regis- 
ter allocation can be applied more globally and 
code around the method call shrinks along with the 
method call. 

C programs can also benefit from this technique. 
Linux source code, for example, is filled with tiny 
inlined functions of this sort. They are easier to 
read than similar macros, and more clearly express
intent to the compiler. 


4.2 Virtual Methods 


Implementing virtual methods usually means an ex- 
tra memory access before the call to the method. 
This sounds like a performance problem, but in fact 
it turns out to be a useful optimization, when used 
wisely. 

A call to a C++ virtual method is often used to
invoke a context-specific behavior. For example, one
might use virtual methods to perform I/O on an
abstract class Device. The typical C equivalent is to
keep function pointers in a structure, and use those
function pointers to perform the operation.

The typical C code has the function pointers in 
the structure with the other variables, making the 
structure larger. Every instance has pointers to all 
the functions, so there are many pointers to every 
function. A more efficient way to do this is to keep 
in the structure only a pointer to a set of functions. 
Thus, the structure has only one pointer to represent 
the behavior functions. 
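
The two layouts can be compared in a short sketch, written in the C subset of C++. All names here (FatDevice, Device, DeviceOps, the mem_* functions and the device_* helpers) are hypothetical and exist only to show the difference in structure size and call path.

```cpp
// Layout 1: every instance carries a pointer to every function.
struct FatDevice {
    int (*read)(FatDevice*);
    int (*write)(FatDevice*, int);
    int (*reset)(FatDevice*);
    int state;
};

// Layout 2: instances share one constant table and carry one pointer.
struct DeviceOps;
struct Device {
    const DeviceOps* ops;
    int state;
};
struct DeviceOps {
    int (*read)(Device*);
    int (*write)(Device*, int);
    int (*reset)(Device*);
};

// One set of behavior functions for a trivial in-memory "device".
static int mem_read(Device* d)         { return d->state; }
static int mem_write(Device* d, int v) { d->state = v; return 0; }
static int mem_reset(Device* d)        { d->state = 0; return 0; }

// A single table, which can live in constant memory.
static const DeviceOps mem_ops = { mem_read, mem_write, mem_reset };

// A call costs one extra load: fetch the table, then the function pointer.
inline int device_write(Device* d, int v) { return d->ops->write(d, v); }
inline int device_read(Device* d)         { return d->ops->read(d); }
```

The shared table is what a compiler-generated virtual table provides, without the programmer maintaining it by hand.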

C++ compilers do this automatically. The virtual 
table is generated by the compiler for each type, and 
placed in constant memory. Here is an example, for 
the i960, of a call to a virtual method following a
pointer in sfoo: 


      ld      _sfoo,g5
      ld      8(g5),g4
      ldis    8(g4),g0
      ld      12(g4),g4
      addo    g5,g0,g0
      callx   (g4)


This generated code loads into g4 the pointer to 
the function to be executed, and into g0 the pointer
to the object. The external pointer variable “sfoo” 
contains the pointer to the object to be manipulated, 
and the virtual method takes no parameters. With 
carefully crafted C code, the programmer can save 
maybe one instruction here. 

Obviously, however, this code is far too much to 
use for accessor methods, such as “Foo::value()”
above. Although the virtual method performs a use- 
ful function efficiently, it is clearly too expensive for 
trivial functions. The common technique of creating 
a virtual method that returns a constant value to 
reveal the true object type is inefficient. If you must 
do this, run time type information is more practical. 

One unexpected benefit of virtual methods is that 
in certain cases the compiler can tell a priori which 
implementation of a virtual method applies, and can 
optimize all the above away, as in: 


      ld      _sfoo,g0
      ld      _sfoo+4,g4
      cmpible g4,g0,L4
      mov     g4,g0
L4:


In this example, “sfoo” is an object and not a
pointer. The C++ compiler knows exactly which
implementation applies, and took the liberty of
implementing the method inline. The only difference
between this and the previous example is that “sfoo”
is known exactly to be of type Foo.


4.3 Assembly Code 


When dealing with “bare iron,” some assembly code 
is inevitable. There is, for example, no reasonable
way to implement thread switching entirely in
C or C++, and in most processors interrupt and 
trap handlers must have assembly code to save the 
context of an interrupted thread and setup a fresh 
C/C++ calling sequence. 

However, assembly code should typically be kept 
to a minimum. A human can do a good job of opti- 
mizing a small stretch of assembly code, but as the 
code grows, the human capacity to manage resource 
allocation becomes overwhelmed and the compiler 
performs better. 

The paradox is that short assembly functions are 
more likely to be inefficient if placed in separate
source files, as the call and return overhead starts 
to overwhelm. What we really want is a way to 
write inline assembly code. Look at the following 
example: 


inline int isr_hot_flag()
{
      register unsigned tmp;
      asm("modpc 0, 0, %0" : "=r" (tmp));
      return tmp & 0x2000;
}


This code returns true if the caller is running in an 
interrupt handler on an i960. The “modpc” instruc- 
tion takes around 14 clock cycles to execute (it is 
slow) but that is not too bad. The call and return 
instructions on the i960Jx each consume 6 cycles and
also push a register set, leading to at least 12 cycles
for the call/return pair. Putting this function in a
separate source file would double the execution time
of the function and may cause a register cache spill 
as well. Inlining this function is therefore rather im- 
portant and powerful. 

The benefits do not stop there. The GNU com- 
piler syntax for inline assembly allows the program- 
mer to match up registers and supply constraints 
that allow the C/C++ compiler to include the code 
in the optimization phase. The following degenerate 
case: 


int foo() { return 0 && isr_hot_flag(); } 


obviously leads to the following optimized i960
assembly code:

_foo__Fv:
      mov     0,g0
      ret


More complex situations are possible, but the 
point here is that the inline assembly can and should 
be included in the optimization passes of the com- 
piler for the best results. A more complex example 
for the i386[7] works as follows: 


inline int call_hosti(int sys, int parm)
{
      asm ("int $0x80\n"
           : "=a" (sys)
           : "0" (sys), "b" (parm));
      return sys;
}


The previous code makes a system call by putting 
the parameter in the ebx register and the system call 
number in the eax register, and calling “int $0x80” 
to trap to the system. The compiler now knows how 
to allocate the eax and ebx registers around this as- 
sembly statement and can use that knowledge when 
optimizing register allocation around it. It can also 
eliminate the whole mess if the result is not at all 
used.⁷

This point doesn't have much to do with object 
oriented design (except that you can write meth- 
ods in assembly code) but has everything to do with 
writing the low-level code in C++. If assembly code 
could not be inlined, an efficient implementation 
would necessarily pull more code into the assem- 
bly files to cut down on the cost of function call 
overhead. At that point, the C++ compiler would 
become more a hindrance than a help.

Including assembly code inline, and using constraints
to control the optimizer, eliminates much
of the interface overhead between C++ and assem- 
bly and gives the programmer the best of the C++ 
and assembly worlds. 


4.4 Templates 


This brings up an interesting theoretical optimiza- 
tion feature of templates. If in the previous example 
the class Foo were a template, the size() method 
could just return a template parameter. Thanks to 
template semantics, calls to the size() method are 
subject to constant elimination, which may in turn 
lead to dead code elimination. That is an example
of the compiler deciding on the course of some
branches, and eliminating the actual branch instruction
from the execution stream.

⁷ If the assembly statement has side effects and should not
be eliminated, it can be declared “volatile” and the compiler
will preserve it.

Similar code in C uses preprocessor macros to get 
the same benefits, and is not type-safe. Templates
used this way save much space and execution time, 
and are more readable. Use templates as a means 
of manipulating the type structure of the program, 
and not a way to create code, and templates may 
actually reduce to take no code space at all. 
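
A small sketch of this use of templates follows. It assumes Foo is reworked as a template whose size is a parameter; clamp_to_line and the limit of 32 are hypothetical, chosen only to give the compiler a branch it can decide per instantiation.

```cpp
// Foo as a template: size() returns a compile-time constant.
template <unsigned N> class Foo {
public:
    constexpr unsigned size() const { return N; }
};

// The comparison depends only on N, so each instantiation has its
// outcome fixed at compile time and the branch can be removed.
template <unsigned N>
constexpr unsigned clamp_to_line(const Foo<N>& f) {
    return f.size() > 32u ? 32u : f.size();
}

// Evidence that the results are constants: they are usable in static_assert.
static_assert(clamp_to_line(Foo<16>()) == 16u, "small sizes pass through");
static_assert(clamp_to_line(Foo<64>()) == 32u, "large sizes are clamped");
```

Neither Foo<16> nor Foo<64> stores anything, and an optimizing compiler is free to replace calls to clamp_to_line with the literal result.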

If template methods are not inlined, the compiler 
must generate an implementation of the template, 
often in a different compilation unit. To make mat- 
ters worse, much near-duplicate code may be gen- 
erated. For example, instantiations with the “int” 
type and the “unsigned” type may generate iden- 
tical code for a small inlined method but generate 
unique implementations for complex out-lined meth- 
ods. Comparisons, for example, require different as- 
sembly code for “int” and “unsigned” values. 

Templates are therefore a terrific way to generate 
lots of excess code when used in this fashion. When 
inlined, the compiler may use template semantics to 
implement efficient, locally optimized code. Other- 
wise, they generate lots of redundant code. 

Compiler writers are still arguing over how to deal 
with template repositories and the like, but in prac- 
tice, for our purposes, the issue is moot. Large tem- 
plate classes or functions will expand to consume all 
available ROM. Practical templates should be small 
enough to be inline, and if inlined everywhere, there
is no need for a template repository. 


4.5 Smaller is Better 


Software tends to expand to fill available memory, 
and programmers tend to judge their creations by 
the number of lines of code. There are two good 
reasons for worrying about program size, even when 
virtual memory is available and inexhaustible. 

The most obvious reason to embedded program- 
mers is that memory costs money and board space. 
Actually, it is the hardware designers who notice 
this first, and design in small amounts of mem- 
ory. The programmer then asks for an additional 
4Meg of flash memory and learns that 512Kbyte per 
chip means that 4Meg is 8 chips and the board just 
doesn’t have that much space. 

Large programs also tend to run slower. If it 
isn’t simply because all those instructions take a 
long time to fetch from memory, it’s because large 
programs overflow the instruction cache more often. 
Even dead code tends to spread the useful code into 
a larger address space and can lead to cache misses. 
Thanks to the widespread use of caches, memory
and instruction, large programs are slow programs.


4.6 Link-time Efficiency 


After all the compilation units are compiled, the pro- 
gram must be linked together with the run time li- 
brary to create a single executable. On conventional 
operating systems, there are shared objects to be 
linked to, and the kernel to be made available. In 
a kernel-based operating system the common func- 
tions are placed in a protected kernel that all the 
applications share. 

In an embedded environment, where there is only 
one application, the kernel is not being shared so its 
size should be included as part of the cost of the 
application. uCR is not kernelized; it is presented
as a library that the linker uses to resolve symbols. 
Only the parts of uCR that are explicitly used are 
allocated space in the final image. 

This is an example program, linked with uCR, 
that has only an empty main and the code necessary 
to enable all the devices on a Picture Elements ISE 
board. The text section includes executable code 
and constants and can be placed in ROM. The bss 
section is large because it contains generous stack
space (10K) for the main thread. 


text    data    bss     dec     filename
3520    1232    10664   15416   file.exe


The following is the same program compiled for 
Linux/SPARC. The stack space is not included in 
the totals, nor is the Linux kernel itself or any of the
4 shared objects that this image links to. 


text    data    bss     dec     filename
2688    2575    8       5271    a.out


Ignoring the bss sections (mostly stack space in
file.exe), the Linux image is about the same size.
The Linux image is for SPARC whereas the uCR image
is for i960, so the differences may easily be due to
instruction set constraints.

What is significant about this example is that the 
entire uCR image takes up no more space than an
image linked to run in an environment that has sev- 
eral megabytes of uncounted resources in the host 
kernel and shared objects. These resources provide 
a very important value in the case of Linux, but none 
of them are of interest in an embedded target. 

As the program becomes more interesting, the
uCR image sizes naturally increase. For example,
the following numbers apply to a program that has
included a debugger interface and code to drive a
PCI bus interface, along with an implementation of
a channel protocol that communicates with a driver
on a host computer. In this image, 16K of buffer
space is included in the bss number, along with the
main stack.


text    data    bss     dec     filename
12544   2824    27284   42652   file.exe


The advantage of placing the uCR infrastructure 
in a library is that only the parts actually used are 
brought into the image. This can include individual 
methods of a class (those that are not inlined). A 
kernel image, on the other hand, must be included 
all at once or not at all. 


5 Java, Anyone? 


We have considered, and are still considering, the 
use of Java in embedded programming. There is an
important problem with it, however, that reduces its 
usefulness, and that is the lack of an “asm” state- 
ment, and other inlining. 


It turns out, when dealing with physical hardware, 
that there is always some little bit of assembly code 
that needs to be written. The obvious first thought 
is to put the assembly code in a library somewhere
and call it when needed. But that is not necessarily 
the right answer. Consider the following familiar 
example for an Intel i960 microprocessor:


inline int isr_hot_flag()
{
      register unsigned tmp;
      asm("modpc 0, 0, %0" : "=r" (tmp));
      return tmp & 0x2000;
}


This function basically reduces to a single instruction,
the “modpc” instruction. Put that in a library
and you get around it two branches and some
register file shuffling, maybe even a few memory
accesses. Wrap it up in a Java class somewhere, and
you also get the overhead of leaving and entering
Java.


A just-in-time compiler for Java bytecode would
address many performance issues by generating un- 
rolled directly executable machine code as the pro- 
gram runs. A syntax would be needed to specify 
specific assembly instructions for inclusion in the 
stream, or the compiler will not be able to match 
the optimization performance of C++.


6 Conclusions


uCR as a whole has become a richer environment, 
now that it works with more substantial processor 
boards. However, its original design goal to stay 
small and efficient seems to be working. The core 
library of uCR is still quite simple. We have also 
come to some conclusions on the matter of C++ 
and embedded programming. 

To this day, we face people telling us that C++
generates inefficient code that cannot possibly be 
practical for embedded systems where speed mat- 
ters. The criticism that C++ leads to bad exe- 
cutable code is ridiculous, but at the same time ac- 
curate. Poor style or habits can in fact lead to awful 
results. On the other hand, a skilled C++ program- 
mer can write programs that match or exceed the 
quality of equivalent C programs written by equally
skilled C programmers. 

The development cycle of embedded software does 
not easily lend itself to the trial-and-error style of 
programming and debugging, so a stubborn C++ 
compiler that catches as many errors as possible at 
compile time significantly reduces the dependence 
on run-time debugging, executable run-time support 
and compile/download/test cycles. This saves un- 
told hours at the test bench, not to mention strain 
on PROM sockets. 


7 Epilogue 


There are times when the proper development envi- 
ronment for a project is not an operating system at 
all. For these times, run time support is all that is
required for convenient development. If an operat- 
ing system does not contribute to the solution, it is 
part of the problem and should not be there. 

Without an operating system, however, the code 
generated must stand alone on the target hardware. 
The runtime support necessary to allow this, and 
even to add some extra features normally associated 
with operating systems (like threads and memory 
management) is fortunately not difficult or
expensive.

uCR does a good job of providing the necessary 
runtime support to the compiler, with little over- 
head. Its small size comes from a careful attention 
to the details of performance, and stubborn control
of feature creep. The core design allows interesting 
functionality to be placed in libraries outside of uCR 
and brought in by the linker only if a programmer 
uses it. 

With uCR, it is possible to get software for a new
board up and running in a few hours, if the CPU is
already supported. It is an important milestone
to get a trivial program running an infinite loop on a 
new processor board and it is best to not invest days 
getting there. Once trivial programs work, more ex- 
tensive hardware debugging can commence. 

The ongoing work on uCR is available by anonymous
ftp and from the Picture Elements web site,
including the documentation, at:


http://www.picturel.com/ucr/uCR.html, and 
ftp://ftp.picturel.com/pub/source. 


It is indeed interesting that efforts to reduce code 
size and increase speed have led to the conclusion 
that significant portions of the source code must be 
visible to the programmer in the form of inline
functions.


Abstract


Sometimes, it is desirable to alter or optimize the behaviour
of an object according to the needs of a specific
portion of the source code (i.e., context), such as a particular
loop or phase. One technique to support this form of
optimization flexibility is a novel approach called scoped
behaviour. Scoped behaviour allows the programmer to 
incrementally tune applications on a per-object and per- 
context basis within standard C++. 

We explore the use of scoped behaviour in the imple- 
mentation of the Aurora distributed shared data (DSD) 
system. In Aurora, the programmer uses scoped be- 
haviour as the interface to various data sharing optimiza- 
tions. We detail how a class library implements the basic 
data sharing functionality and how scoped behaviour co- 
ordinates the compile-time and run-time interaction be- 
tween classes to implement the optimizations. We also 
explore how the library can be expanded with new classes 
and new optimization behaviours. 

The good performance of Aurora suggests that using 
scoped behaviour and a class library is a viable approach 
for supporting this form of optimization flexibility. 


1 Introduction 


Optimizing a program’s data access behaviour can sig- 
nificantly improve performance. Ideally, the program- 
ming system should allow each object to be optimized 
independently of other objects and each portion of the 
source code (i.e., context) to be optimized independently
of other contexts. Towards that end, researchers have 
explored various compiler and run-time techniques to 
provide per-object and per-context flexibility in applying 
an optimization. 

We describe how scoped behaviour, a change in the
implementation of methods for the lifetime of a language
scope, can provide the desired optimization flexibility 
within standard C++. A language scope (i.e., nested 
braces in C++) around source code selects the context 
and the re-defined methods implement the optimization. 
Scoped behaviour requires less engineering effort to im- 
plement than compiler extensions and it is better inte- 
grated with the language, thus less error-prone to use, 
than typical run-time libraries. 

Specifically, we focus on a single application of scoped
behaviour: supporting optimized distributed data sharing.
Since this discussion is closely tied to a particular problem
domain, we begin with a brief introduction to distributed 
data sharing. Then we provide an overview of the Aurora 
distributed shared data system [Lu97], detail how scoped 
behaviour and the class library are implemented, and 
discuss some performance issues. 


2 Distributed Data Sharing 


Parallel programming systems based on shared memory 
and shared data models are becoming increasingly pop- 
ular and widespread. Accessing local and remote data 
using the same programming interface (e.g., reads and 
writes) is often more convenient than mixing local
accesses with explicit message passing.

On distributed-memory platforms, the lack of hard- 
ware support to directly access remote memories has 
prompted a variety of software-based, logically-shared 
systems. Broadly speaking, there are distributed shared 
memory (DSM) [Li88, BCZ90, ACD+96] and distributed
shared data (DSD) [BKT92, SGZ93, JKW95] systems.
Support for distributed data sharing, whether it is page-based
as with DSM, or object-based (or region-based) as
with DSD, is an active area of research. The spectrum of
implementation techniques spans special hardware support,
run-time function libraries, and special compilers.


Layer                       Main Components and Functionality
--------------------------  ----------------------------------------------------------------
Programmer’s Interface      Teams of threads for SPMD-style parallelism, active objects
                            Distributed vector and scalar objects

Shared-Data Class Library   Handle-body shared-data objects
                            Overloaded operators and special methods; immediate data access
                            (default behaviour)
  Scoped behaviour          Data sharing optimizations: owner-computes, caching data for
                            reads, release consistency for writes

Run-Time System             Active objects and remote method invocation (currently, ABC++)
                            Threads (currently, pthreads)
                            Communication mechanisms (currently, shared memory and MPI)

Table 1. Layered View of Aurora


(a) Original Loop

    GVector<int> vector1( 1024 );

    for( int i = 0; i < 1024; i++ )
        vector1[ i ] = someFunc( i );


(b) Optimized Loop Using Scoped Behaviour

    GVector<int> vector1( 1024 );

    { // Begin new language scope
        NewBehaviour( vector1, GVReleaseC, int );

        for( int i = 0; i < 1024; i++ )
            vector1[ i ] = someFunc( i );
    } // End scope


Figure 1. Applying a Data Sharing Optimization Using Scoped Behaviour


In this context, the all-software Aurora DSD sys- 
tem provides a shared-data programming model on 
distributed-memory hardware. All shared data are encapsulated
as objects and are accessed using methods. To
overcome the latency and bandwidth performance prob- 
lems of typical distributed-memory platforms, Aurora 
provides a set of well-known data sharing optimizations. 

Although other DSM and DSD systems also offer data 
sharing optimizations, Aurora is unique in how these opti- 
mizations are integrated into the programming language. 
Pragmatically, scoped behaviour allows the applications 
to be incrementally tuned with reduced programmer ef- 
fort. Also, as an experimental platform, Aurora’s class 
library approach is relatively easy to extend with new behaviours.
In particular, one of the goals of this research is
to support common data sharing idioms, specified and op- 
timized using scoped behaviour, with good performance. 


3 Overview of Aurora 


Aurora can be viewed as a layered system (Table 1). The 
key layers will be discussed later on, but we begin with a 
quick overview. 

Application programmers are primarily concerned 
with the upper two layers of the system: the program-
mer’s interface and the shared-data class library. The ba-
sic data-parallel process model is that of teams of threads
operating on shared data in SPMD-fashion (single pro- 
gram, multiple data). The basic shared-data model is that 
of a distributed vector object or a distributed scalar ob- 
ject. Once created, a shared-data object is transparently 
accessed, regardless of the physical location of the data, 
using normal C++ syntax. By default, shared data is read 
from and written to immediately (i.e., synchronously), 
even if the data is on a remote node, since that data ac- 
cess behaviour has the least error-prone semantics. 


Figure 1(a) demonstrates how a distributed vector object
is instantiated and accessed. GVector is a C++ class
template provided by Aurora. Any built-in data type or 
user-defined structure or class can be used as the tem- 
plate argument. The size of the vector is a parameter 
to the constructor and, currently, the vector elements are 
block distributed across the physical nodes. 
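To make the block distribution concrete, the following is a minimal sketch of how a vector index could be mapped to a node and a local offset. The names Location, blockSize, and locate are hypothetical and are not part of Aurora; the rounding rule (block size rounded up) is also an assumption, since the text does not specify how uneven sizes are handled.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of Aurora: where does element `index`
// of an n-element vector live when the vector is block distributed
// over `nodes` nodes?
struct Location {
    std::size_t node;    // which node holds the element
    std::size_t offset;  // position within that node's block
};

// Assumption: the block size is rounded up so every element has a home.
inline std::size_t blockSize(std::size_t n, std::size_t nodes) {
    return (n + nodes - 1) / nodes;
}

inline Location locate(std::size_t index, std::size_t n, std::size_t nodes) {
    assert(nodes > 0 && index < n);
    const std::size_t b = blockSize(n, nodes);
    return Location{ index / b, index % b };
}
```

With 1024 elements on 4 nodes, each node holds 256 consecutive elements, so index 256 is the first element on node 1.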


Now, for example, if a shared vector is updated in 
a loop and if the updates do not need to be performed 
immediately, then the loop can use release consistency 
[GLL+90, AG96] and batch the writes (see Figure 1(b),
shown side-by-side for easy comparison). Without any 
changes to the loop code itself, the behaviour of the updates
to vector1 is changed within the language scope.
The NewBehaviour macro specifies that the release consistency
optimization should be applied to vector1.


(a) Common Preamble

    int i, j;

    // Prototype of C-style function with innermost loop
    int dotProd( int * a, int * b, int j, int n );


(b) Sequential Code

    // mA, mB, mC are 512 x 512 matrices

    for( i = 0; i < 512; i++ )
        for( j = 0; j < 512; j++ )
            mC[ i ][ j ] =
                dotProd( &mA[ i ][ 0 ], mB, j, 512 );


(c) Optimized Parallel Code

    // mA, mB, mC are 512 x 512 GVectors

    { // Begin new language scope

        NewBehaviour( mA, GVOwnerComputes, int );
        NewBehaviour( mB, GVReadCache, int );
        NewBehaviour( mC, GVReleaseC, int );

        while( mA.doParallel( myTeam ) )
            for( i = mA.begin(); i < mA.end(); i += mA.step() )
                for( j = 0; j < 512; j++ )
                    mC[ i ][ j ] =
                        dotProd( &mA[ i ][ 0 ], mB, j, 512 );

    } // End scope


Figure 2. Matrix Multiplication in Aurora

Therefore, scoped behaviour is the main interface be- 
tween the programming model and the data sharing opti- 
mizations, providing: 


• Per-object flexibility: The ability to apply an optimization
to a specific shared-data object without
affecting the behaviour of other objects. Within a 
context, different objects can be optimized in dif- 
ferent ways (i.e., heterogeneous optimizations). 


• Per-context flexibility: The ability to apply an optimization
to a specific portion of the source code.
Different portions of the source code (e.g., differ- 
ent loops and phases) can be optimized in different 
ways. 


The lowest layer of Aurora, the run-time system, pro- 
vides the basic thread management and communication 
mechanisms. The current implementation of Aurora uses
the ABC++ class library for its active object mecha- 
nism, an object that has a thread of control associated 
with it, and remote method invocation (RMI) facilities 
[OEPW96]. RMIs are syntactically similar to normal 
method invocations, but RMIs can be between objects in 
different address spaces. If desired, the application pro- 
grammer can directly utilize the active object and RMI 
mechanisms to implement a more control-parallel process
model. Also, although ABC++ already has a parametric
shared region (PSR) mechanism, it is not used by Aurora.

In turn, ABC++ uses standard pthreads [Pth94] for 
concurrency and either shared memory or MPI message 
passing [GLS94] for communication. 


4 Programmer’s Interface 


A more detailed description of the programmer’s inter- 
face to Aurora can be found elsewhere [Lu97], but we 
briefly touch upon the main ideas with an example. 


4.1 Example: Matrix Multiplication 


For illustrative purposes, consider the problem of non- 
blocked, dense matrix multiplication, as shown in Figure 
2. The preamble is common to both the sequential and 
parallel codes (Figure 2(a)). The basic algorithm consists 
of three nested loops, where the innermost loop computes 
a dot product and can be factored into a separate C- 
style function. An appropriate indexing function for two- 
dimensional arrays in C/C++ is assumed. 

Conceptually, we can view an optimization as a change
in the type of the shared object for the lifetime of the
scope. The current set of available behaviours is summarized
in Table 2.


Scoped Behaviour       Description
---------------------  ------------------------------------------------
Owner-computes         Threads access only co-located data.
Caching for reads      Create local copy of data.
Release consistency    Buffer write accesses.
Special-purpose        Used with owner-computes for specific applications
data movement          (e.g., stencils in 2-D diffusion simulation).

Table 2. Some Scoped Behaviours


As an example of per-object flexibility,
three different data sharing optimizations are applied to
the sequential code in Figure 2(b) to create the parallel
code in Figure 2(c). Specifically:


1. NewBehaviour(mA,GVOwnerComputes, int): 
To partition the parallel work, the owner-computes 
technique is applied to distributed vector mA. 


Within the scope, mA is an object of type 
GVOwnerComputes and has special methods 
doParallel(), begin(), end(), and step(). 
Only the threads (each represented by a local 
myTeam pointer) that are co-located with a por- 
tion of mA’s distributed data actually enter the 
while-loop and iterate over their local data. Also, 
when dotProd() is called, a type constructor for
GVOwnerComputes returns a C-style pointer to the 
local data so that the function executes with maxi- 
mum performance. 


Although some changes to the source code are 
required to apply owner-computes, they are rel- 
atively straightforward. Other work partitioning 
strategies, that do not use the special methods pro- 
vided by Aurora, are allowed, but owner-computes 
is both convenient and efficient. 


2. NewBehaviour(mB, GVReadCache, int): To 
automatically create a local copy of distributed vec- 
tor mB at the start of the scope, since it is read-only 
and re-used many times, its type is changed to 
GVReadCache. 


The scoped behaviour of a read cache also includes 
a type constructor so that dotProd() can be called 
with C-style pointers that point to the cache. Note 
that no lexical changes to the loop’s source code 
are required for this optimization. 


3. NewBehaviour(mC, GVReleaseC, int): To 
reduce the number of update messages to elements 
of distributed vector mC during the computation, its
type is changed to GVReleaseC.


Within the scope, the overloaded operators batch 
the updates into a per-target address space buffer
and messages are only sent when the buffer is full
or when the scope is exited. Also, multiple writers 
to the same distributed vector are allowed. No 
lexical changes to the source code are required. 


The result of this heterogeneous set of optimizations is
that the nested loops can execute without remote data ac- 
cesses and the parallel program can use the same efficient 
dotProd() function as in the sequential program. 
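The owner-computes loop structure can be illustrated with a single-process sketch. The class OwnerRange and the function visitAll below are hypothetical: only the method names doParallel(), begin(), end(), and step() come from the text, while their bodies, the block-distribution rule, and the omission of the team argument are assumptions made for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an owner-computes view of one node's share
// of a block-distributed vector. Method names follow the text; the
// bodies are assumptions.
class OwnerRange {
    int first_, last_;   // half-open range [first_, last_) of local indices
    bool entered_;       // lets doParallel() act as a one-shot loop guard
public:
    OwnerRange(int n, int nodes, int me) : entered_(false) {
        const int block = (n + nodes - 1) / nodes;
        first_ = me * block < n ? me * block : n;
        last_  = first_ + block < n ? first_ + block : n;
    }
    // True exactly once, and only if this node owns some data.
    bool doParallel() {
        if (entered_ || first_ == last_) return false;
        entered_ = true;
        return true;
    }
    int begin() const { return first_; }
    int end()   const { return last_; }
    int step()  const { return 1; }
};

// Simulates every node in turn and records which indices each one visits.
// Returns the sum of all visited indices.
inline long visitAll(int n, int nodes, std::vector<int>& hits) {
    long sum = 0;
    for (int me = 0; me < nodes; ++me) {
        OwnerRange r(n, nodes, me);
        while (r.doParallel())
            for (int i = r.begin(); i < r.end(); i += r.step()) {
                ++hits[i];
                sum += i;
            }
    }
    return sum;
}
```

In this sketch every index is visited by exactly one simulated node, and a node that owns no data never enters the while-loop, which mirrors the behaviour described for threads that are not co-located with any portion of the data.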


4.2 Discussion: Programming in Aurora 


The typical methodology for developing Aurora applications
consists of three main steps. First, the code is ported
to Aurora. Shared arrays and shared scalars are converted
to GVectors and GScalars. Although the default immediate
access policy can be slow, its performance can
be optimized after the program has been fully debugged. 

Second, the work is partitioned among the processors 
and threads. Owner-computes and SPMD-style paral- 
lelism are common and effective strategies for many ap- 
plications. However, the application programmer may 
implement other work partitioning schemes. 

Lastly, various data sharing optimizations can be tried 
on different bottlenecks in the program and on different 
shared-data objects. Often, the only required changes 
are a new language scope and a NewBehaviour macro. 
Sometimes, straightforward changes to the looping parameters
are needed for owner-computes. For example,
in the matrix multiplication program, owner-computes
can be applied to vector mC instead, with read caches used
for both vector mA and vector mB. The dotProd() function
and the data access source code remain unchanged.
The new optimization strategy uses more resources for
read caches than the original strategy, but, since mC is
being updated, it is perhaps a more conventional application
of owner-computes. Reverting to the original
strategy is also relatively easy. For the application
programmer, the ability to experiment with different op- 
timizations, with limited error-prone code changes, can 
be valuable. 


5 Scoped Behaviour


Scoped behaviour is a change in the implementation of
selected methods for the lifetime of a language scope.
For the Aurora programmer, scoped behaviour is how
an optimization is applied to a shared-data object. For
the system and class designer, scoped behaviour is an
interface between collaborating classes that changes the
implementation of the selected methods. Some of the
ideas behind scoped behaviour have been explored as
part of the handle-body and envelope-letter idioms in
object-oriented programming [Cop92] (to be discussed
further in Section 6.1). Scoped behaviour builds upon
these ideas.


(a) Scoped Behaviour Macro

    // Macro provided by aurora.H
    #define NewBehaviour( XX, YY, ZZ ) \
        GPortal<GVector<ZZ> > AU_ ## XX( XX ); \
        YY<ZZ> XX( AU_ ## XX );

    // Class template provided by aurora.H
    template <class C_OrigHandle>
    class GPortal
    {
    private:
        C_OrigHandle * save;            // Saved handle
    public:
        GPortal( C_OrigHandle & h )     // In: Constructor
            { save = &h; }
        operator C_OrigHandle &()       // Out: Type constructor
            { return *save; }
    }; // GPortal


(b) Source Code

    { // Begin new language scope
        NewBehaviour( vector1, GVReleaseC, int );

        for( int i = 0; i < 1024; i++ )
            vector1[ i ] = someFunc( i );
    } // End scope

    vector1[ 0 ] = 1;    // Immediate update


(c) After Standard Preprocessor Pass

    { // Begin new language scope
        GPortal<GVector<int> > AU_vector1( vector1 );
        GVReleaseC<int> vector1( AU_vector1 );

        for( int i = 0; i < 1024; i++ )
            vector1[ i ] = someFunc( i );
    } // End scope

    vector1[ 0 ] = 1;    // Immediate update (still)


Figure 3. Aurora’s Scoped Behaviour Macro


5.1 Language Scopes and Scoped Behaviour 
Objects 


The main motivation for using language scopes to define 
the context of scoped behaviour is to exploit the property 
of name hiding. In block-structured languages, an iden- 
tifier can be re-used within a nested language scope, thus 
hiding the identifier outside of the scope. 

Instantiations of a class that are designed to be used 
within a language scope, and which hide objects outside 
the scope, are called scoped behaviour objects. 
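Name hiding itself is ordinary C++ and can be shown without any Aurora classes. In the sketch below, Plain, Tuned, and trace are hypothetical names chosen for illustration; the point is only that the same source text, data.describe(), binds to a different class inside the nested scope.

```cpp
#include <cassert>
#include <string>

// Two unrelated hypothetical classes with the same interface.
struct Plain { std::string describe() const { return "immediate"; } };
struct Tuned { std::string describe() const { return "buffered";  } };

inline std::string trace() {
    std::string log;
    Plain data;                       // outer object
    log += data.describe();           // resolves to Plain::describe
    {                                 // nested language scope
        Tuned data;                   // hides the outer `data`
        log += "," + data.describe(); // same text, now Tuned::describe
    }                                 // inner `data` is destroyed here
    log += "," + data.describe();     // outer object is visible again
    return log;
}
```

The binding is resolved entirely at compile time, which is what allows the compiler to generate behaviour-specific code for the statements inside the scope.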


5.2 Implementing Scoped Behaviour 


As shown in Figure 3(a), Aurora provides the scoped 
behaviour macro NewBehaviour and the class template 
GPortal via a header file. Figure 3(b) shows the original
programmer’s source code and Figure 3(c) shows the code 
after the standard preprocessor of the C++ compiler has 
expanded the macro. Again, the code is shown side-by- 
side for comparison. 

The NewBehaviour macro is parameterized by the
name of the original shared-data object, the type of the
new scoped behaviour object, and the type of the vector
elements.¹ The macro instantiates two objects. The
first object, AU_vector1, is of type GPortal. Its sole
function is to cache a pointer to the original object, which
is passed as a constructor argument, and then pass it
along to the scoped behaviour object’s constructor. The
second object, the scoped behaviour object vector1 of
type GVReleaseC<int>, hides the original object but
can access its internal state using the pointer passed by
AU_vector1. Thus, the scoped behaviour object can
mimic or change the functionality of the original shared-data
object.
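The two-object pattern can be reproduced in a few lines of standard C++. The sketch below is a toy, not Aurora's code: Vec, Portal, Doubling, NEW_BEHAVIOUR, and demo are hypothetical stand-ins, and the "behaviour" simply stores twice the value so that its effect is easy to observe. Only the shape of the macro and of the portal class follows Figure 3.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins; only the macro and portal shapes follow Figure 3.
template <class T>
struct Vec {                          // plays the role of GVector
    std::vector<T> data;
    explicit Vec(int n) : data(n) {}
};

template <class Handle>
class Portal {                        // plays the role of GPortal
    Handle* save;
public:
    Portal(Handle& h) : save(&h) {}
    operator Handle&() { return *save; }
};

template <class T>
class Doubling {                      // a toy "behaviour": stores 2*x
    Vec<T>& orig;
public:
    Doubling(Vec<T>& v) : orig(v) {}
    void set(int i, T x) { orig.data[i] = 2 * x; }
};

// The portal is declared first, while XX still names the outer object.
#define NEW_BEHAVIOUR(XX, YY, ZZ) \
    Portal<Vec<ZZ> > AU_##XX(XX); \
    YY<ZZ> XX(AU_##XX);

inline int demo() {
    Vec<int> v(4);
    {
        NEW_BEHAVIOUR(v, Doubling, int);
        v.set(1, 21);                 // calls Doubling<int>::set
    }
    return v.data[1];                 // the outer v is visible again
}
```

The intermediary is needed because, in a declaration such as Doubling<int> v(v), the name v already refers to the object being declared by the time the initializer is evaluated. Declaring the portal on the preceding line captures the outer object while its name is still visible.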

We will discuss the implementation of these classes in 
more detail in Section 6, but we provide an overview of 
the basic ideas. 


Since the scoped behaviour object has the same name
as the original vector1, the compiler will generate
the loop body code according to class GVReleaseC instead
of the original object’s class. However, the user’s
source code does not change. Even though the original
and scoped behaviour objects collaborate to implement
scoped behaviour, we can conceptualize it as temporarily
changing the type of the original object. The
NewBehaviour macro helps to hide this abstraction.
Note that source code outside of the context of the optimization
continues to refer to the original GVector.
Therefore, immediate update remains the default behaviour
outside of the scope, illustrating per-context flexibility.


¹Note that it is a multi-line macro and the ## symbol is the standard
preprocessor operator for lexical concatenation. Also, the prefix AU_ is
arbitrary and can be redefined, if necessary.

Unfortunately, the more concise syntax of GVReleaseC<int>
vector1( vector1 ) conflicts with the C++ standard (i.e., the
new vector1 is passed a reference to itself, instead of to the original
object), so an intermediary object is required. Fortunately, the macro
hides the existence of the intermediary object.

The class template GVReleaseC is designed to be- 
have exactly like GVector, except that the overloaded 
operators now buffer updates to the vector elements. 
Read accesses to the vector continue to be performed 
immediately, even if the data is remote. Thus, the 
class of a scoped behaviour object can selectively re- 
define behaviour on a method-by-method and operator- 
by-operator basis. 

Also, since vector1 is a new object within the scope,
dynamic run-time actions can be associated with the var- 
ious constructors and the destructor. In particular, the 
destructor flushes the update buffers to the vector so that 
all updates are guaranteed to be performed when the scope 
is exited. 
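The flush-on-destruction idea can be sketched as follows. Remote, BufferedWriter, and messagesFor are hypothetical names, the "remote" store lives in the same process, and a single buffer is used instead of one buffer per target address space; these are all simplifying assumptions, not a description of Aurora's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical "remote" store that counts how many messages reach it.
struct Remote {
    std::vector<int> cells;
    int messages = 0;
    explicit Remote(std::size_t n) : cells(n, 0) {}
    void applyBatch(const std::vector<std::pair<std::size_t, int> >& batch) {
        ++messages;                               // one message per batch
        for (const auto& w : batch) cells[w.first] = w.second;
    }
};

// Buffers writes; sends when the buffer fills or when the object dies.
class BufferedWriter {
    Remote& target;
    std::size_t capacity;
    std::vector<std::pair<std::size_t, int> > pending;
public:
    BufferedWriter(Remote& r, std::size_t cap) : target(r), capacity(cap) {}
    ~BufferedWriter() { flush(); }                // scope exit forces delivery
    void write(std::size_t i, int v) {
        pending.emplace_back(i, v);
        if (pending.size() == capacity) flush();
    }
    void flush() {
        if (pending.empty()) return;
        target.applyBatch(pending);
        pending.clear();
    }
};

// Performs n writes through a buffer of size cap; returns messages sent.
inline int messagesFor(std::size_t n, std::size_t cap, Remote& r) {
    {
        BufferedWriter w(r, cap);
        for (std::size_t i = 0; i < n; ++i) w.write(i, static_cast<int>(i) + 1);
    }                                             // destructor flushes the tail
    return r.messages;
}
```

With ten writes and a buffer of four entries, two messages are sent when the buffer fills and a third is sent by the destructor, compared with ten messages under immediate update.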

Although this description has centered on a particu- 
lar class, the basic scoped behaviour technique can be 
applied to a variety of classes and objects. The owner- 
computes, caching for reads, and other behaviours use the 
same NewBehaviour macro and are based on the same
design principles. 

Of course, the basic ideas behind the implementation 
of scoped behaviour are not new. The notion of nested 
scopes is fundamental to block-structured sequential languages.
The association of data movement actions with
C++ constructors and destructors is also not new (for example,
in ABC++). However, scoped behaviour is unique
in that it coordinates the interaction of different classes 
to create per-object and per-context behaviours. 


5.3 Advantages and Disadvantages 


The advantages of scoped behaviour include: 


1. Standards-based implementation. Scoped behaviour
can be implemented within standard C++
as a preprocessor macro. The class library, to be
discussed in the next section, is also standard C++.


2. Flexibility of experimentation. Scoped behaviour
makes it easy to add, modify, and remove be- 
haviours with minimal or no lexical source code 
changes.


3. Flexibility of implementation. The compile-time
aspect of scoped behaviour allows the compiler 
(and implementor) to generate behaviour-specific 
code based on different classes. The run-time 
aspect of scoped behaviour allows dynamic be- 
haviour, such as data movement and interactions 
with the run-time system, to be associated with 
constructors and destructors. 


A disadvantage of scoped behaviour is that, since it is 
a programming technique instead of a first-class compiler 
feature, it cannot access the compiler’s symbol table for 
high-level analyses. A more general disadvantage is that, 
since the run-time behaviour depends on constructors and 
destructors with static invocation points, it cannot be di- 
rectly ported to a language like Java [Sun96]. Java is 
a garbage-collected language and the current definition 
does not have destructors in the same sense as C++. 


Compared to some other DSM and DSD systems, 
scoped behaviour has safety and performance benefits. 

For example, GVReleaseC has been explicitly implemented
with a constructor that takes a parameter of type
GVector&. Therefore, programming errors involving incompatible
objects, such as trying to use release consistency
with normal C++ arrays, will result in compile-time
errors. More generally, as with all object-oriented systems,
methods are invoked on objects and thus it is impossible
to pass the wrong shared-data object as a function
call parameter. Also, the automatic construction and destruction
of scoped behaviour objects make it impossible
for the programmer to omit a required data movement action
at the end of a context. Non-object-oriented function
libraries may only be able to catch these forms of errors
at run-time, if at all.
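The compile-time rejection of incompatible objects can be demonstrated with a type-trait check. SharedVec and ReleaseBehaviour below are hypothetical miniature types, and std::is_constructible is a modern standard-library facility that postdates this work; the sketch only illustrates the kind of error the constructor signature produces.

```cpp
#include <type_traits>

// Hypothetical miniature types; only the constructor signature matters.
template <class T> struct SharedVec { };

template <class T>
struct ReleaseBehaviour {
    explicit ReleaseBehaviour(SharedVec<T>&) { }
};

// Accepted: the behaviour is built from a shared vector.
constexpr bool acceptsSharedVec =
    std::is_constructible<ReleaseBehaviour<int>, SharedVec<int>&>::value;

// Rejected: a plain C++ array is not a shared vector.
constexpr bool acceptsPlainArray =
    std::is_constructible<ReleaseBehaviour<int>, int (&)[1024]>::value;

// Rejected as well: the element types must agree.
constexpr bool acceptsWrongElement =
    std::is_constructible<ReleaseBehaviour<int>, SharedVec<double>&>::value;

static_assert(acceptsSharedVec, "shared vector should be accepted");
static_assert(!acceptsPlainArray, "plain array should be rejected");
static_assert(!acceptsWrongElement, "mismatched element type should be rejected");
```

Writing ReleaseBehaviour<int> b(someArray) directly would fail to compile for the same reason; the trait is used here only so that the sketch itself remains compilable.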

As with some other systems, performance benefits 
can arise from exploiting high-level data access seman- 
tics. For example, GVReadCache is intended for data 
that is read-only and where most of the elements will be 
accessed during the context. Therefore, Aurora can read 
the data in bulk rather than demanding-in each portion 
of the data with a separate data movement action. Also, 
GVReleaseC is intended for data that is updated but not
read. Therefore, unlike some other systems, Aurora can 
avoid the overhead of demanding-in the remote data be- 
fore overwriting it. 


6 Shared-Data Class Library 


In this section, we take a detailed look at the design and 
implementation of the C++ classes for the shared-data 
objects and data sharing optimizations. By design, these 
classes collaborate to support scoped behaviour.


Figure 4. Handle-Body Composite Objects


6.1 Handle-Body Composite Objects 


The main architectural feature of the shared-data class 
library is the use of the handle-body idiom to create composite
objects [Cop92, OEPW96] for shared data (Figure
4). The handle object defines the programmer’s interface
to the shared data. The body object (or objects) contains
the actual data.

The extra level of indirection afforded by a composite 
handle-body approach allows for: 


1. Data distribution. A distributed vector is a set of 
body objects and each body object can be located in 
a different address space or on a different physical 
node. The handle includes a partition object to 
abstract the distribution strategy and a directory 
object to keep track of the location of the bodies. 
A distributed scalar has a single body object. 


Figure 4 shows a distributed vector object with a 
handle and two body objects, where one of the 
body objects is on a different node than the handle. 


2. Location-transparent data accesses. Through 
overloaded operators in the handle, the distributed 
data can be accessed through a uniform interface, 
regardless of the location of the actual data. Thus, 
for a given vector index, the partition object deter- 
mines which body holds the data and the directory 
object provides a pointer to the body object. 


3. Cheap parameter passing of shared data. Only
handles are passed across function calls; the data
in the bodies are not copied. Handles can also be 
passed between address spaces, if desired, since 
the partition and directory objects are sufficient to 
locate any body object from any address space. 
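The three properties above can be seen in a small single-process sketch of the handle-body idiom. Body, Handle, and writeThroughCopy are hypothetical names; the partition is reduced to a fixed block length and the directory to a vector of pointers, whereas in Aurora the bodies may live in other address spaces and are reached by remote method invocation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical body: holds one block of the data (here, in one process).
struct Body {
    std::vector<int> block;
    explicit Body(std::size_t n) : block(n, 0) {}
    int  get(std::size_t off) const { return block[off]; }
    void put(std::size_t off, int v) { block[off] = v; }
};

// Hypothetical handle: a partition rule plus a directory of bodies.
class Handle {
    std::size_t blockLen;             // partition: elements per body
    std::vector<Body*> directory;     // directory: where each body lives
public:
    Handle(std::size_t blockLen_, const std::vector<Body*>& dir)
        : blockLen(blockLen_), directory(dir) {}
    int read(std::size_t i) const {
        return directory[i / blockLen]->get(i % blockLen);
    }
    void write(std::size_t i, int v) {
        directory[i / blockLen]->put(i % blockLen, v);
    }
};

// The handle is passed by value: it is copied, but the bodies are shared.
inline int writeThroughCopy(Handle h, std::size_t i, int v) {
    h.write(i, v);
    return h.read(i);
}
```

A write made through a copied handle is visible through the original handle, because copying a handle duplicates only the partition and directory information, never the data held in the bodies.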


For performance-sensitive functions, such as 
dotProd() in Figure 2, the overheads of indirection can
be avoided in controlled ways through type constructors 
that return C-style pointers. 

The current implementation of Aurora creates handles 
as passive (i.e., regular) C++ objects. However, each individual
body is implemented as an active object, which
is useful for implementing any necessary synchroniza- 
tion behaviour. Handle and body interact using remote 
method invocations. The run-time system automatically 
selects between shared-memory and message-based com- 
munication mechanisms for transmitting RMIs. 


6.2 Class Hierarchy for Handles 


Since most of the data sharing functionality is imple- 
mented in the handles, this discussion will focus on the 
handle classes. Briefly, however, the body classes sup- 
port get() and put() data access methods, including 
batch update and block-read variations. For the current 
data sharing optimizations in Aurora, this simple func- 
tionality is all that is required. 


Figure 5 is a diagram of the main classes in the class hierarchy
for shared-data handles.² Aside from the names
of the classes, the diagram shows the relationship between
classes. The is-a relationship is the usual notion
of inheritance. For example, class GHandle is the base
class for all handles. Common access methods are factored
into the base class. The holds-a relationship exists
when a class contains a pointer (or pointers) to an instance
of another class. This is used, for example, to
allow one object to access the internal state of another
object. The creates-a relationship exists when at least
one of the methods of a class returns an object of another
class. For example, an overloaded subscript operator
(i.e., operator[]) can return an object which encodes
information about a specific vector element [Cop92].


Figure 5. Class Hierarchy for Handles

We can also distinguish the classes by the way they are, 
or are not, templated. Class GHandle is not templated 
in order to simplify the implementation of mechanisms 
that only require limited functionality from a handle. For 
example, querying about the number of vector elements 
does not require knowledge about template arguments. 
However, the most important class templates for the sys- 
tem implementor are parameterized by both the data ele- 
ment type and the class of the body object. 

In general, the application programmer is only ex- 
pected to use the classes with a single template argument 
for the data element type (labelled “User” in Figure 5 and 
highlighted in gray). These classes hide the more complex
templating and class hierarchy considerations that
the “System” must deal with.


²The notation is based on Booch [Boo91], but with some simplifications
and changes to better suit this presentation.


For data sharing using immediate access, the important 
classes are GSHandle and GVHandle (shown inside the 
box in Figure 5). These classes encapsulate member data 
to keep track of the body or bodies. 


Figure 6 provides a more detailed look at the interfaces 
for the classes that implement the shared vector. Class 
GHandle, which is not templated, is a convenient base 
class within which to implement methods common to all 
handles. Class GVector does little more than specify 
the specific body class (i.e., LVector) for the second 
template argument to GVHandle and call the appropriate
constructors. 


Most of the functionality for the shared vector is implemented
by class GVHandle. In particular, the overloaded
subscript operator returns an object of type GPointerSC, 
which is a pointer object. When evaluating C++ expres- 
sions involving objects and overloaded operators, tempo- 
rary objects represent the result of sub-expressions. Since 
the actual data for a term may be a remote shared data 
element, the temporary object points to the body object 
with the data. Class GPointerSC has data members to
store the vector index and a pointer to the specific body
object with that element. Reading from or writing to the
vector element invokes the appropriate type constructors 
and the overloaded assignment operator of GPointerSC, 
resulting in an immediate remote memory access.


    // Base class. Not templated.
    class GHandle
    {
    private:
        int numElements;                    // Number of vector elements
        // ...other data members...
    public:
        // ...various constructors and destructor...
        int size() { return numElements; }  // Common access method
        // ...other methods...
    }; // GHandle (System)


    // Template argument C_Data is the element type; C_LV is the body class.
    // Classes GVScopedHandle, Partition, Directory, GPointerSC are provided by Aurora.
    template <class C_Data, class C_LV>
    class GVHandle : public GHandle         // is-a GHandle
    {
        // GVScopedHandle needs access to internal state (for holds-a)
        friend class GVScopedHandle<C_Data, C_LV>;
    protected:
        Partition<MAX_LOCALS> partition;    // Distribution strategy
        Directory<C_LV> directory;          // Location of body object(s)
        // ...other data members...
    public:
        GVHandle( int numElements );        // Construct with size of vector
        ~GVHandle();
        GPointerSC<C_LV, C_Data> operator[]( int index );  // Immediate data access (creates-a)
        // ...other methods...
    }; // GVHandle (System)


    // Template argument C_Data is the element type; LVector (provided by Aurora) is the body class.
    template <class C_Data>
    class GVector : public GVHandle<C_Data, LVector<C_Data> >  // is-a GVHandle
    {
    public:
        GVector( int numElements ) :        // Construct with size of vector
            GVHandle<C_Data, LVector<C_Data> >( numElements ) {}
        ~GVector();
        // ...inherits operator[] and other methods...
    }; // GVector (User)





Figure 6. Interface for Shared Vector: GVector


    // Template argument C_Data is the element type; C_LV is the body class.
    // Remember that I am a friend of GVHandle.
    template <class C_Data, class C_LV>
    class GVScopedHandle : public GHandle   // is-a GHandle
    {
    protected:
        GVHandle<C_Data, C_LV> * origHandle;  // To access internal state of original object (holds-a)
        // ...other data members...
    public:
        GVScopedHandle( GVHandle<C_Data, C_LV> & gv )  // Construct with original handle
            { origHandle = &gv; }           // Cache the handle
        ~GVScopedHandle();
        // ...other methods...
    }; // GVScopedHandle (System)


    // Template argument C_Data is the element type; C_LV is the body class.
    // Classes Cache, BatchWrite, and GPointerRC are provided by Aurora.
    template <class C_Data, class C_LV>
    class GVRWBehaviour : public GVScopedHandle<C_Data, C_LV>  // is-a GVScopedHandle
    {
    protected:
        Cache<C_Data, C_LV> * readCache;    // Configurable read cache
        BatchWrite<C_Data, C_LV> * updateBuf[MAX_LOCALS];  // Configurable buffer for release consistency
        // ...other data members...
    public:
        GVRWBehaviour( GVHandle<C_Data, C_LV> & gv ) :  // Construct with original handle
            GVScopedHandle<C_Data, C_LV>( gv ) {}
        ~GVRWBehaviour();                   // Destructor flushes update buffers if necessary
        void createCache();                 // Method to create read cache
        void allowUpdateBuf();              // Method to allow update buffers
        GPointerRC<C_LV, C_Data> operator[]( int index );  // Data access via cache/buffer (creates-a)
        // ...other methods...
    }; // GVRWBehaviour (System)


    // Template argument C_Data is the element type; LVector (provided by Aurora) is the body class.
    template <class C_Data>
    class GVReleaseC : public GVRWBehaviour<C_Data, LVector<C_Data> >  // is-a GVRWBehaviour
    {
    public:
        GVReleaseC( GVector<C_Data> & gv ) :  // Original handle via GPortal of NewBehaviour macro
            GVRWBehaviour<C_Data, LVector<C_Data> >( gv )
            { this->allowUpdateBuf(); }     // Construct to allow update buffers
        ~GVReleaseC();
        // ...inherits operator[] and other methods...
    }; // GVReleaseC (User)


Figure 7. Interface for Release Consistency Scoped Behaviour: GVReleaseC


154 Conference on Object-Oriented Technologies and Systems - June 16-20, 1997 USENIX Association 




6.3 Data Sharing Optimizations: Scoped Behaviour Objects


For the data sharing optimizations, the parent class 
GVScopedHandle extracts and maintains information 
about the internal state of a given GVHandle, as per the 
holds-a relationship (Figure 7). This functionality is an 
important part of implementing scoped behaviour. The 
partition and directory objects of the GVHandle are not
copied, thus reducing the construction costs of a scoped 
behaviour object. 


Class GVOwnerComputes, in its constructor, uses the 
extracted internal state to determine the address of the 
body object’s data. Therefore, GVOwnerComputes can 
return a C-style pointer from the appropriate type con- 
structor and from the overloaded subscript operator. As 
previously discussed, GVOwnerComputes also defines 
special functions to support easy iterating over the local 
data. 


Class GVRWBehaviour can, optionally, create a 
read cache for shared data and create update buffers 
to shared data (Figure 7). Classes that derive from 
GVRWBehaviour explicitly configure the caching and 
buffering options. The overloaded subscript operator in 
GVRWBehaviour returns an object of class GPointerRC,
which is similar in concept to class GPointerSC, but
with two important differences. First, if the read cache
exists and is loaded, then GPointerRC is configured
to access data from the cache instead of from the re- 
mote body. Second, if the update buffers are enabled 
in GVRWBehaviour, then GPointerRC is configured to 
store updates in the buffer rather than initiate a remote 
memory access. GVRWBehaviour creates the buffers on 
demand. Depending on the configuration of the cache 
and buffers, GPointerRC will access shared data appro- 
priately. 
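
The read and write paths described above can be pictured with a small proxy-object sketch. This is not Aurora's code: RemoteBody, ProxyRC, and ScopedRW are hypothetical names, the "remote" body is an ordinary local array whose accesses are merely counted, and the element type is fixed to int for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-in for a remote body object; counts "remote" accesses.
struct RemoteBody {
    std::vector<int> data;
    int remoteReads = 0;
    int remoteWrites = 0;
    explicit RemoteBody(std::size_t n) : data(n, 0) {}
    int read(std::size_t i) { ++remoteReads; return data[i]; }
    void write(std::size_t i, int v) { ++remoteWrites; data[i] = v; }
};

// Hypothetical proxy returned by operator[]; configured once per access.
class ProxyRC {
    RemoteBody& body_;
    const std::vector<int>* cache_;       // null if no read cache is loaded
    std::map<std::size_t, int>* buffer_;  // null if update buffers are off
    std::size_t index_;
public:
    ProxyRC(RemoteBody& b, const std::vector<int>* c,
            std::map<std::size_t, int>* u, std::size_t i)
        : body_(b), cache_(c), buffer_(u), index_(i) {}

    // Read: use the cache when present, otherwise go to the body.
    operator int() const {
        return cache_ ? (*cache_)[index_] : body_.read(index_);
    }
    // Write: store in the buffer when enabled, otherwise write through.
    ProxyRC& operator=(int v) {
        if (buffer_) (*buffer_)[index_] = v;
        else body_.write(index_, v);
        return *this;
    }
};

// Hypothetical scoped handle that owns the optional cache and buffer.
class ScopedRW {
    RemoteBody& body_;
    std::vector<int> cache_;
    bool cacheLoaded_ = false;
    std::map<std::size_t, int> buffer_;
    bool buffering_ = false;
public:
    explicit ScopedRW(RemoteBody& b) : body_(b) {}
    ~ScopedRW() { flush(); }                       // flush on scope exit
    void createCache() { cache_ = body_.data; cacheLoaded_ = true; }  // bulk copy
    void allowUpdateBuf() { buffering_ = true; }
    void flush() {
        for (const auto& kv : buffer_) body_.write(kv.first, kv.second);
        buffer_.clear();
    }
    ProxyRC operator[](std::size_t i) {
        return ProxyRC(body_, cacheLoaded_ ? &cache_ : nullptr,
                       buffering_ ? &buffer_ : nullptr, i);
    }
};

// Usage: the write is held back until the scoped object is destroyed.
int demoBufferedWrite() {
    RemoteBody body(4);
    {
        ScopedRW h(body);
        h.allowUpdateBuf();
        h[1] = 5;                        // buffered, no remote write yet
        assert(body.remoteWrites == 0);
    }                                    // destructor flushes the buffer
    return body.data[1];
}
```

The point of the sketch is that the caller writes `h[1] = 5` in both configurations; only the proxy's configuration decides whether that statement touches the body.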


Therefore, the constructor of class GVReadCache calls 
the appropriate GVRWBehaviour methods to create and 
load the read cache. Thus, when the subscript operator 
for GVReadCache, which is inherited from the parent 
class, creates a GPointerRC object, it will always access 
the cache. GVReadCache also defines a type constructor 
to return a C-style pointer to the cache. 


Similarly, class GVReleaseC calls the appropriate
GVRWBehaviour constructor and enables the use of update
buffers (Figure 7). Thus, when the subscript operator
for GVReleaseC, which is inherited from the parent
class, creates a GPointerRC object, it will always use 
the buffers. The destructor for class GVRWBehaviour 
makes sure all buffers are flushed. 


7 Extending the Library 


Within the class hierarchy, new data sharing optimizations
can be implemented. We consider a trivial but
illustrative example: a new class could both
cache data for reading and buffer updates. The new class 
would derive from GVRWBehaviour. The new class’s 
constructor creates the read cache and also enables the 
update buffers. The GPointerRC objects created by the 
new class would always read from the cache and always 
buffer updates. By default, updates are also mirrored in 
the cache. Admittedly, this “new” data sharing optimiza- 
tion is easy to add because of the design and existing 
functionality of GVRWBehaviour and GPointerRC, but 
the basic techniques can be used for more complex addi- 
tions to the library. 
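
The extension just described can be sketched in a few lines. The classes below are hypothetical, much-reduced stand-ins (RWBehaviourSketch is not GVRWBehaviour); they only record which options a derived class switches on, which is enough to show that the new behaviour is a matter of constructor configuration.

```cpp
#include <cassert>

// Hypothetical stand-in for the parent class: it records which options
// a derived class has enabled and exposes them for inspection.
class RWBehaviourSketch {
    bool cacheOn_ = false;
    bool bufferOn_ = false;
protected:
    void createCache()    { cacheOn_ = true; }   // would load the read cache
    void allowUpdateBuf() { bufferOn_ = true; }  // would enable update buffers
public:
    virtual ~RWBehaviourSketch() {}
    bool readsFromCache() const { return cacheOn_; }
    bool buffersUpdates() const { return bufferOn_; }
};

// Existing-style behaviours: each configures one option in its constructor.
class ReadCacheSketch : public RWBehaviourSketch {
public:
    ReadCacheSketch() { createCache(); }
};

class ReleaseCSketch : public RWBehaviourSketch {
public:
    ReleaseCSketch() { allowUpdateBuf(); }
};

// The "new" behaviour from the text: cache reads and buffer updates.
class CacheAndReleaseSketch : public RWBehaviourSketch {
public:
    CacheAndReleaseSketch() { createCache(); allowUpdateBuf(); }
};

// Usage: the new class differs from its siblings only in configuration.
bool demoNewBehaviour() {
    CacheAndReleaseSketch both;
    ReleaseCSketch releaseOnly;
    assert(!releaseOnly.readsFromCache());
    return both.readsFromCache() && both.buffersUpdates();
}
```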

There are three main techniques for extending the li- 
brary of data sharing optimizations. The techniques can 
also be combined. 


1. New classes. Define new classes for partition, di- 
rectory, body, and pointer objects. 


Currently, only a block-distributed partition object 
is implemented. If a cycle-distributed object is 
required in the future, a new partition class could 
abstract the distribution details. Finally, as we have 
seen, classes like GPointerSC and GPointerRC
are useful for defining new memory access be- 
haviours. 


2. New methods. Inherit from a parent class, then add 
new scoped behaviour with new methods. 


For example, GVOwnerComputes adds new meth- 
ods for iterating over local data. 


3. Re-define methods. Inherit from a parent class, 
then re-define behaviour through constructors, the 
destructor, methods, operators, and type construc- 
tors. 


For example, GVReleaseC relies on its parent class
for most of its functionality. GVReleaseC merely
configures the update buffers appropriately in its 
constructor. 


8 Performance 


To date, we have experimented with three Aurora programs
[Lu97]. The programs are matrix multiplication
(Figure 2), a 2-D diffusion simulation, and Parallel Sorting
by Regular Sampling (PSRS) [SS92, LLS+93]. Recent
performance results are shown in Table 3. Speedups
are computed against C implementations of the same
algorithm (or against quicksort in the case of the parallel
sort). In particular, the sequential implementations do not
suffer from the overheads of either operator overloading
or scoped behaviour.


Program          Data set (sequential time)                    Network        Speedup (by number of PEs)
Matrix Multiply  704 x 704 (175 sec. seq.)                     Fast Ethernet  [not recoverable]
                 [size illegible] (65.8 sec. seq.)             Fast Ethernet  [not recoverable]
2-D Diffusion    1526 x 1526, 32 time-steps (47.8 sec. seq.)   Fast Ethernet  [not recoverable]
                 1024 x 1024, 32 time-steps (20.3 sec. seq.)   Fast Ethernet  [not recoverable]
PSRS             10 million keys (60.4 sec. seq.)              Fast Ethernet  [not recoverable]
                 6 million keys (33.9 sec. seq.)               Fast Ethernet  1.21  2.05  22

Table 3. Aurora Programs on a Network of Workstations


The distributed-memory platform used for these ex- 
periments is a cluster of PowerPC 604 workstations with 
133 MHz CPUs, 96 MB of main memory, and a sin- 
gle, non-switched 100 Mbit/s Fast Ethernet network. 
The software includes IBM’s AIX 4.1 operating sys- 
tem, AIX’s pthreads, and the MPICH (version 1.0.13) 
[DGLS93] implementation of MPI. 


Two trends can be noted in the performance results. 
First, for these three programs, additional processors
improve speedup, albeit with diminishing returns. Second,
as the size of the data set increases, the overall granularity 
of work, and thus speedup, also increases. 


Contention for the single network and a reduced gran- 
ularity of work can account for the diminishing returns 
for more processors with a fixed problem size. For exam- 
ple, since the read cache’s data requirements are constant 
per-processor, communication costs and network contention
grow when replicating vector mB in matrix multiplication.
Communication costs under contention also
account for the overheads in the parallel sort program, 
since the algorithm includes a key exchange. For the 2-D 
diffusion simulation, the granularity of a time-step before 
a barrier quickly falls to below one second as processors 
are added. Fortunately, if the problem size increases, the 
computation’s overall granularity also increases resulting 
in better absolute speedups. 


The performance of Aurora programs on this particular
hardware platform is encouraging, but there remain
two important avenues for future work: different network
technology and new scoped behaviours. A 155 Mbit/s
ATM network has been installed on the platform, but it
is not yet fully exploited by the run-time system. However,
early experience indicates that the additional bandwidth
and improved contention characteristics of ATM
will benefit Aurora programs. Also, there is currently no 
overlap between communication (for reads) and compu- 
tation in the existing scoped behaviours. For simplicity, 
GVReadCache loads all of the data before allowing com- 
putation to continue. Using the techniques described in 
this paper, the library of scoped behaviours will be ex- 
tended to better hide the read latency of the distributed- 
memory hardware. 


9 Discussion and Related Work 


Distributed data sharing is an example of a problem domain
where per-object and per-context optimization flexibility
is desirable. The data access behaviour of a shared-data
object can change depending on the loop or program
phase, so a single data sharing policy is often insufficient 
for all contexts. In general, optimization flexibility can 
be supported through compiler annotations or a run-time 
system interface, but scoped behaviour offers advantages 
in terms of engineering effort, safety, and implementation 
flexibility. 

Since Ivy [Li88], the first DSM system, a large body of
work has emerged in the area of DSM and DSD systems
(for example, [BCZ90, BKT92, BZS93, SGZ93, JKW95,
ACD+96]). Related work in parallel array classes (for
example, [LQ92]) has also addressed the basic problem 
of transparently sharing data. 

Different access patterns on shared data can be optimized
through type-specific protocols and run-time annotations.
Both Munin [BCZ90] and Blizzard [FLR+94]
provide protocols customized to specific data sharing 
behaviours. Run-time libraries, such as shared regions 
[SGZ93], SAM [SL94], and CRL [JKW95], associate 
coherence actions with access annotations (i.e., function
calls). Unlike Munin, Aurora does not require special 
compiler support and different optimizations can be used 
in different contexts. Unlike Blizzard, Aurora integrates 
the optimizations into the programming language to gen- 
erate custom code for different coherence actions, for 
added implementation and performance flexibility. Un- 
like function libraries, the automatic construction and 
destruction of scoped behaviour objects make it impossi- 
ble for the programmer to omit an annotation and miss a 
coherence action. 

Aurora’s handle-body object architecture and the asso- 
ciation of data movement with constructors and destruc- 
tors are inspired by the parametric shared region (PSR) 
mechanism of ABC++. However, there are some signif- 
icant differences between Aurora’s shared-data objects 
and PSRs. First, Aurora allows distributed vectors to be 
partitioned between different address spaces to improve 
scalability and to support owner-computes using multiple
nodes. A PSR has a single home node, therefore shared
data cannot be partitioned and owner-computes cannot be 
used within a PSR. Second, Aurora uses operator over- 
loading and pointer objects, which gives the system more 
flexibility to generate behaviour-specific code, and to op- 
timize the read and write behaviour of shared data sepa- 
rately. Aurora can also return C-style pointers to shared 
data under controlled circumstances. The data in a PSR is 
always accessed using C-style pointers, which is efficient,
but it does not allow the system to selectively intervene 
in data accesses. Lastly, Aurora supports multiple writ- 
ers to the same distributed vector object, which can be 
important for performance [ACD+96], while PSRs only
allow a single writer. 


10 Concluding Remarks 


Researchers have explored a variety of different imple- 
mentation techniques for DSM and DSD systems. The 
Aurora DSD programming system is an example of a 
software-only implementation that uses data sharing optimizations
to achieve good performance on a set of parallel
programs.

What distinguishes Aurora from other DSM and DSD 
systems is its use of scoped behaviour as an interface to 
a set of data sharing optimizations. Scoped behaviour 
supports per-context and per-object flexibility in apply- 
ing the optimizations. This novel level of flexibility is 
particularly useful for incrementally tuning multi-phase 
parallel programs and programs in which different shared 
objects are accessed in different ways. The performance 
of Aurora is encouraging and future work will explore 
new data sharing optimizations and how they can exploit 
different network performance characteristics. 

Scoped behaviour can be implemented in standard
C++ without special compiler support and it offers impor-
tant safety benefits over typical run-time libraries. The 
technique appears to be a viable approach for supporting 
this form of optimization flexibility. 
Abstract


The Standard Template Library (STL) [15, 19] is a
C++ implementation of the generic programming 
paradigm [16]. Unlike in typical container class li- 
braries, algorithms in this paradigm do not work 
directly on collection container objects. They work 
on iterators (access and traversal objects) exported 
by containers. Given N data types, M containers, 
and K algorithms as components of a software system,
STL provides a mechanism (using C++ templates
and the generic programming paradigm) to
reduce the possibly N * M * K implementations to
N + M + K implementations.

Over the last decade significant research has been 
done in the area of object-oriented parallelism and 
a number of models, libraries, and language exten- 
sions have been proposed and implemented. C++ 
has been an important language both for writing parallel
libraries and as a base for parallel extensions. In
this paper we discuss how the generic program- 
ming paradigm in STL can be used and extended 
to support parallel programming in a manner that 
allows good expressibility, code reuse, and extensi- 
bility of the library. We look at control and data- 
parallel abstractions. We also discuss how differ- 
ent strategies for work distribution in parallel algo- 
rithms can be supported in the spirit of generic pro- 
gramming. We describe the relevant abstractions in 
Coir<Futures> [23], our STL-based parallel C++
library for shared memory parallelism. 


1. Introduction


The object-oriented paradigm has been successful in
providing good expressibility, maintainability, and
reuse of software systems. With the growing popularity
of the object-oriented paradigm, researchers
in the parallel compiling and computing world have
experimented with using this paradigm for parallelism.
Using and extending C++ has been popular [25],
though a few systems based on Eiffel [14]
and Smalltalk [10] have also been built.


Generic programming [16] is a paradigm that ab- 
stracts concrete, efficient algorithms that can be 
combined with different data representations to pro- 
duce a wide variety of useful software. For in- 
stance, using this paradigm, a generic sorting al- 
gorithm can be instantiated to work with different 
aggregate data structures like linked lists or arrays. 
Originally developed in Ada and Scheme, such a 
library has been recently implemented in C++ as 
the STL [19, 15] and in Java [17]. STL has been
adopted by the C++ ANSI standard committee [1]. 
Its success is inevitable as C++ programmers have 
discovered a new style of writing container class li- 
braries. 


As practitioners of object-oriented parallel pro- 
cessing, we believe that the generic programming 
paradigm adds a new dimension to building paral- 
lel libraries by enabling better reuse of sequential 
code written in the same paradigm, by allowing the
writing of extensible parallel programs, and by providing
interesting and useful parallel abstractions.
The Coir<Futures> shared memory parallel system
was designed with this belief. In the following
paragraphs, we will see how abstractions for asynchronous
parallelism and for data parallelism are
supported in Coir<Futures> along the lines of the
generic programming paradigm.

This paper is organized as follows: In the next
section we discuss generic programming and STL.
In section 3 we describe our Coir<Futures> library
briefly. In section 4 we discuss control parallel abstractions
through futures [9] and in section 5 we
describe data parallel objects. In section 6 we discuss
data parallel extensions to sequential STL algorithms.
In section 7 we discuss related work and
in section 8 we draw our conclusions.


2. Generic Programming and the Standard Template Library


The generic programming paradigm prescribes four 
kinds of abstractions: data abstractions, algorith- 
mic abstractions, structural abstractions, and rep- 
resentational abstractions. 

Data abstractions are data types and sets of oper- 
ations on them. C++ provides templates as the nec- 
essary constructs for data abstractions. Templates 
provide a uniform interface and implementation ab- 
stractions for different data types. For instance, a 
template stack class can be instantiated to a stack 
of integers, doubles, or any user-defined type. Thus,
for N data types only one template container class 
is provided which can be instantiated N ways. 

STL provides implementations for the remaining 
three abstractions through STL algorithms, itera- 
tors, and adaptors. These are discussed below. 


Algorithms 


Generic algorithmic abstractions are families of data 
abstractions with a common set of algorithms. In 
order to make algorithms generic they are designed 
to work on iterators (see below) that are exported 
by containers. For instance, a sort algorithm could 
work on a linked list or a vector data abstraction if 
the list and vector collection classes provide itera- 
tor objects that mark the beginning and end of the 
container. Algorithms are implemented as template 
functions in STL, typically parameterized over iter- 
ators or structural abstractions. 
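
A minimal sketch of this idea, assuming a C++11 compiler: countEqual is a hypothetical algorithm of our own (not an STL function) that is written only against iterators and is then instantiated for a vector and for a linked list.

```cpp
#include <cassert>
#include <list>
#include <vector>

// A generic algorithm written only against iterators: it needs ++, !=
// and * from its iterator type, and nothing from the container itself.
template <class InputIterator, class T>
int countEqual(InputIterator first, InputIterator last, const T& value) {
    int n = 0;
    for (; first != last; ++first)
        if (*first == value) ++n;
    return n;
}

// The same template instantiated for a vector...
int countZeroesInVector() {
    std::vector<int> v = {0, 3, 0, 7};
    return countEqual(v.begin(), v.end(), 0);
}

// ...and for a linked list, with no change to the algorithm.
int countZeroesInList() {
    std::list<int> l = {4, 0, 9};
    int n = countEqual(l.begin(), l.end(), 0);
    assert(n >= 0);
    return n;
}
```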


Iterators 


Iterators are implementations of structural abstrac- 
tions and are data type templates exported by con- 
tainer classes. Iterators are generalizations of array 
pointers for generic containers and provide opera- 
tors to traverse the range of data they point to and 
also operators to reference the element they point 
to. Typically, the pointer arithmetic operators like 
++ (auto-increment), -- (auto-decrement), +n
(jump n positions forward), and -n (jump n positions
backward), are overloaded to provide traversal
implementations. They also overload the comparison
operators (==, <, >, !=) and the assignment
operator (=) to compare iterator positions and al- 
low iterator assignments, respectively. The C++ * 
operator is overloaded to reference the element at 
the position pointed to by the iterator. Algorithms 
work over iterators rather than directly over con- 
tainers. Therefore the same algorithm can work for 
different container types as long as they export ap- 
propriate types of iterators. An additional advan- 
tage of having algorithms work on iterators instead 
of on containers is that algorithms can be used to 
work on a partial range of elements in a container. 

An iterator can be of one of the following kinds:
input, output, forward, bidirectional, or random-access.
Input iterators are data sources (e.g. the
cin standard input object in C++), and output
iterators are data sinks (e.g. cout in C++). Forward
iterators satisfy properties of both input and output
iterators. Forward iterators can be traversed
one position at a time only in the forward direction,
hence they support only the ++ operator. Bidirectional
iterators satisfy properties of forward iterators,
can be traversed in both forward and reverse
directions one step at a time, and support the --
operator. Random access iterators are bidirectional
iterators, which can make non-unit jumps in the forward
or reverse direction. They support the +n and
-n operators too.

Most container classes export member functions 
called begin() and end() which return iterators that 
point to the first element and past the last element, 
respectively, of the container object. Starting with 
these functions, and using the iterator traversal op- 
erators, users can construct iterators pointing to a 
subrange of the elements in a container. 
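
As a small illustration (sumRange and sumOfMiddle are hypothetical helpers, and a C++11 compiler is assumed), a subrange is obtained by moving the iterators returned by begin() and end():

```cpp
#include <cassert>
#include <vector>

// Sum the elements in the half-open range [first, last).
template <class InputIterator>
int sumRange(InputIterator first, InputIterator last) {
    int total = 0;
    for (; first != last; ++first) total += *first;
    return total;
}

int sumOfMiddle() {
    std::vector<int> v = {10, 20, 30, 40, 50};
    assert(v.end() - v.begin() == 5);
    // begin() + 1 and end() - 1 use the random-access +n and -n operators,
    // so the range covers the elements 20, 30, and 40 only.
    return sumRange(v.begin() + 1, v.end() - 1);
}
```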


Adaptors 


Generic representational abstractions are mappings 
from one structural abstraction to another. Called 
adaptors in STL, these abstractions are casting 
wrappers that change the appearance of a container 
(building a stack from a list), or an iterator (con- 
verting a bidirectional iterator to a reverse iterator). 
STL also has adaptors to convert C++ I/O streams 
and arrays to STL-style containers. 
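
Both kinds of adaptor can be shown with standard components. The helper functions topAfterPushes and reversedCopy below are hypothetical names chosen for this sketch.

```cpp
#include <cassert>
#include <list>
#include <stack>
#include <vector>

// Container adaptor: a stack interface layered over a std::list.
int topAfterPushes() {
    std::stack<int, std::list<int> > s;
    s.push(1);
    s.push(2);
    s.push(3);
    s.pop();            // removes 3
    assert(s.size() == 2);
    return s.top();     // 2
}

// Iterator adaptor: rbegin() and rend() return reverse iterators, which
// wrap the vector's own iterators and traverse it back to front.
std::vector<int> reversedCopy(const std::vector<int>& in) {
    return std::vector<int>(in.rbegin(), in.rend());
}
```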


Functors 


STL also defines function objects (or functors) 
which are basically template function pointers 
wrapped in template classes. These classes provide 
a '()' operator which is used for invoking the function.
STL also provides adaptors to convert normal
C++ function pointers to function objects. 
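
The sketch below shows a hand-written function object and a pointer-to-function adaptor. GreaterThan, PointerToUnary, and wrapFunction are hypothetical names; the adaptor is written out by hand in the spirit of ptr_fun, partly because std::ptr_fun itself was removed from the standard library in C++17.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// A function object: state (the threshold) plus an operator().
template <class T>
class GreaterThan {
    T threshold_;
public:
    explicit GreaterThan(const T& t) : threshold_(t) {}
    bool operator()(const T& x) const { return x > threshold_; }
};

// Hypothetical adaptor class wrapping a unary function pointer.
template <class Arg, class Result>
class PointerToUnary {
    Result (*f_)(Arg);
public:
    explicit PointerToUnary(Result (*f)(Arg)) : f_(f) {}
    Result operator()(Arg x) const { return f_(x); }
};

// Helper that deduces Arg and Result from the function pointer.
template <class Arg, class Result>
PointerToUnary<Arg, Result> wrapFunction(Result (*f)(Arg)) {
    return PointerToUnary<Arg, Result>(f);
}

bool isNegative(int x) { return x < 0; }

// Both kinds of object are passed to the same STL algorithm.
int countMatches() {
    std::vector<int> v = {-2, 5, 9, -1, 12};
    int big = static_cast<int>(
        std::count_if(v.begin(), v.end(), GreaterThan<int>(8)));      // 9, 12
    int neg = static_cast<int>(
        std::count_if(v.begin(), v.end(), wrapFunction(isNegative))); // -2, -1
    assert(big >= 0 && neg >= 0);
    return big * 10 + neg;
}
```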


Example


STL defines a container template class called vector.
It also provides an algorithm called find which
takes in three arguments (two input iterators
first and last indicating a range, and a value argument
val) and returns an iterator pointing to
the first position in the range in which the element
matches the value val. The following code
builds a vector of integer elements, finds the position
of the first zero element in the vector, and then
the position of the next zero element in the vector.


// create a vector with 10 elements (all initialized to zero)
vector<int> v(10);

// set the first two elements of the vector
v[0] = 1;
v[1] = 5;

// point to the first element of the vector
vector<int>::iterator i1 = v.begin();

// the following assertion will be true
assert(*i1 == 1 && *(i1+1) == 5);

// point past the last element of the vector
vector<int>::iterator i2 = v.end();

// find the first zero occurrence
vector<int>::iterator first = find(i1, i2, 0);

if (first == i2)
    cout << "vector has no zeroes" << endl;
else {
    // find the second zero occurrence, starting just past the first one
    vector<int>::iterator second = find(first + 1, i2, 0);
    if (second == i2)
        cout << "vector has one zero" << endl;
    else
        cout << "vector has two or more zeroes"
             << endl;
}


3. The Coir<Futures> System 


The Coir<Futures> library is our STL-based 
generic parallel programming library for shared- 
memory systems. It is built on top of a light-weight 
user-level thread library called Coir-Core [22] which 
supports standard thread operations for shared 
memory machines, and, in addition, includes sup- 
port for locality, affinity, and migration domains. 


Coir<Futures> recognizes control parallelism 
and data parallelism as two important models of 
parallelism. Control parallel abstractions provide a 
mechanism for specifying that one piece of computa- 
tion can proceed in parallel with and independent 
of another piece of computation. Coir<Futures>
supports control parallelism through future abstrac- 
tions which are built on top of thread constructs. It 
supports thread classes and their inheritance mech- 
anism. It has support for monitor-style program- 
ming and it separates thread-level operations from 
processor-level operations. 

Data parallelism is a powerful and most com- 
monly provided and used construct for parallelism, 
especially in scientific parallel computing. In a 
data parallel model, multiple processors/threads 
perform a common computation but each proces- 
sor/thread operates on different data. In languages 
that have functions as first class data types, func- 
tional data parallelism can be defined by repre- 
senting the common computation by a function or 
a function pointer. Traditionally data parallelism 
has been provided as a processor-level abstraction. 
The most common style is SPMD (Single Program 
Multiple Data) parallelism [6] where each proces- 
sor executes the same program but based on the 
processor id each one executes it on a different 
section of the data. The program computation 
is interspersed with synchronization and data exchange.
Coir<Futures> supports functional data
parallelism at the thread level through the abstrac- 
tions of thread groups called ropes [21]. Ropes are 
powerful data parallel objects which provide a scop- 
ing mechanism for non-blocking, multithreaded, 
and interleaved data parallelism, deviating from the 
popular but restricted SPMD-style data parallelism. 
Coir<Futures> has template facilities to customize 
these objects based on the function signatures. 


4. Futures for Control Parallelism


Futures [9] are place-holders or "IOUs" for values
or computations. Future abstractions are useful
in parallel programming because they allow asynchronous
computation by delaying the
computation of the value of an object until the value
is required. Good abstractions for futures are use- 
ful in writing parallel programs that look similar 
to their sequential counterparts. Coir<Futures> 
supports two kinds of futures — data futures and 
computation futures. Data futures are simple place- 
holders for values. The futures are resolved by some 
arbitrary computation entering a value into the future.
Computation futures are those that represent
a delayed computation (execution of a function). 
The value of a computation future is the value re- 
turned by the function that represents the computa- 
tion. Data futures are templatized on the base types 
of the values they represent. Computation futures 
are templatized on the signature of the computa- 
tions (functors) they represent. Since the return 
type of a function is a part of its signature, the base 
type of the value that the future represents is hid- 
den in the template parameter. Future classes also 
support a cast operator to their base types. The 
cast operator is a blocking operator and waits for 
the future resolution i.e., for the future to have a 
valid value. 
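
A data future with a blocking cast can be sketched with standard C++11 synchronization primitives. DataFuture and its members are hypothetical names for this illustration and are not the Coir<Futures> classes; the sketch also assumes the program is built with thread support (for example, -pthread with GCC or Clang).

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical data future: a place-holder that some other computation
// resolves by entering a value into it.
template <class T>
class DataFuture {
    std::mutex m_;
    std::condition_variable cv_;
    bool resolved_ = false;
    T value_{};
public:
    // Resolve the future and wake up any thread blocked in the cast.
    void resolve(const T& v) {
        {
            std::lock_guard<std::mutex> lock(m_);
            value_ = v;
            resolved_ = true;
        }
        cv_.notify_all();
    }
    // Blocking cast to the base type: waits until the value is valid.
    operator T() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return resolved_; });
        return value_;
    }
};

// Usage: the consumer's arithmetic triggers the cast, which blocks until
// the producer thread has resolved the future.
int produceAndConsume() {
    DataFuture<int> f;
    std::thread producer([&f] { f.resolve(40); });
    int result = f + 2;
    producer.join();
    assert(result == 42);
    return result;
}
```

Because the cast is implicit, the consuming expression `f + 2` looks the same as it would for a plain int, which is the property the text attributes to good future abstractions.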


4.1. Computation Future Functor Adaptors


The programming paradigm of the STL goes very 
well with the future abstraction described above. 
For instance, future objects can be created out of ar- 
bitrary function objects. STL has adaptors to con- 
vert N-ary function pointers to N-ary function ob- 
jects where N = 1, 2. Function objects and adapters 
for N = 3, 4,...can be easily built. Coir<Futures> 
defines adaptors that convert N-ary function point- 
ers to future N-ary function objects. This is done 
using the following two constructs (see footnote 1):


1. A future_pointer_to_N-ary_function template
class for every N-ary function that provides 
the necessary type definitions and member 
functions for future-based operations. 


2. A future_ptr_fun template function that is an 
adaptor from an N-ary function pointer to an 
instance of a future_pointer_to_N-ary_function 
class. 


4.1.1. future_pointer_to_N-ary_function 


This template class is defined as follows: 


template <class Arg1, class Arg2, ..., class ArgN,
          class Result>
class future_pointer_to_N-ary_function
    : public N-ary_function<Arg1, Arg2, ..., ArgN,
                            Result>
{
protected:
    pointer_to_N-ary_function<Arg1, Arg2, ..., ArgN,
                              Result> Nf;
public:
    typedef TaskN<Arg1, Arg2, ..., ArgN,
                  Result>::future_type future_type;
    future_pointer_to_N-ary_function(
            Result (*f)(Arg1, Arg2, ..., ArgN))
        : Nf(ptr_fun(f))
    {}

    ~future_pointer_to_N-ary_function()
    { ...clean up task lists... }
    future_type operator()(Arg1, Arg2, ..., ArgN);
};


Footnote 1: We have freely used N-ary in the code fragments to
stand for unary, binary, ternary, .... The actual system has
separate constructs for each of these cases.


This template class has N + 1 parameters: N
parameters (Arg1, ..., ArgN) for the N argument
types of the N-ary function this corresponds to,
and the last parameter (class Result) for the result
type of the N-ary function.

This class is defined as a subclass of the
N-ary_function template class provided in STL. (Note
that STL defines only unary and binary function 
template classes. Ternary, quaternary function tem- 
plate classes etc. can be easily defined.) 

The template class defines/exports a public type 
definition — future_type. This is the type of the han- 
dle that is returned when a ’()’ operator of an object 
of this class is invoked (see below). The implementation
of future_type should ensure that an object of
future_type is castable to the Result type.

The constructor of the template class takes one 
argument which is an N-ary function pointer. 
It converts this N-ary function pointer into a 
pointer_to_N-ary_function object using the ptr_fun
adaptor provided in STL and stores it in its member 
defined by Nf. 

The template class TaskN stands for a thread or 
a task implementation class that exports a future 
template data type. One requirement on this fu- 
ture data type is that it has a cast operator that 
can be used to convert it to Result type. In a typ- 
ical implementation the future will have fields that 
mark the status of the underlying task and whether 
the future has been resolved. The cast operator will 
check this field to see if the task underlying the fu- 
ture has been scheduled and completed. If it is not 
completed it will block the current thread until the
underlying task completes and then resolve the future
to the value returned by the task. If the task is 
completed before this cast operator is invoked, the 
return value is saved and the future is marked to 
have a valid value. In this case, the cast operator
returns immediately with the saved value as the
value of the future.

The future_pointer_to_N-ary_function class also 
supports a ’()’ operator which is basically an invo- 
cation operator of the future function object. It ex- 
pects N arguments and creates a task of type TaskN 
with the N-ary function objects and these N argu- 
ments and returns a future_type object. The created 
task is added to an internal list of tasks. When the 
destructor of the future_pointer_to_N-ary_function
object is invoked (when the object is deleted or goes 
out of scope), this internal list and the tasks in that 
list are deleted and disposed. 


4.1.2. future_ptr_fun 


The adaptor future_ptr_fun is used to convert an or- 
dinary N-ary function pointer to a future function 
object as described above. It is defined as follows: 


template <class Arg1, class Arg2, ..., class ArgN,
          class Result>
future_pointer_to_N-ary_function<
    Arg1, Arg2, ..., ArgN, Result>
future_ptr_fun(Result (*x)(Arg1, Arg2, ...,
                           ArgN))
{
    return future_pointer_to_N-ary_function<
        Arg1, Arg2, ..., ArgN, Result>(x);
}

This template function also is parameterized on
N + 1 parameters: N argument types and one result
type. The function itself takes one argument
which is an N-ary function pointer, builds a
future_pointer_to_N-ary_function object (passing this
function pointer to its constructor), and returns this 
object. Figure 1 shows how this adaptor behaves. 


4.1.3. Example 


The following piece of code is an example of how 
the user of this library can use this facility. This 
code shows how a unary function pointer can be 
’adapted’ to a future unary function object and 
used. 

Suppose int (*foo)(int) is a unary function 
pointer that expects an integer argument and re- 
turns an integer result. It can be converted to a 
future unary function object and used as follows: 


future_pointer_to_unary_function<
    int,int>::future_type fval
    = (future_ptr_fun(foo))(4);     // (A)
// do some unrelated work           // (B)
int final = fval + 3;               // (C)


[Figure 1 diagram: Result (*)(Arg1, ... ArgN) is passed to future_ptr_fun<Arg1, ... ArgN, Result>, which yields future_pointer_to_N-ary_function<Arg1, ... ArgN, Result>]

Figure 1: The Future adaptor function template behavior:
It takes a function pointer for an N-ary function
and returns a future N-ary function object.


The code executes as follows (see figure 2 for a pic- 
torial representation of the execution of this code): 


In the statement labeled (A) the following se- 
quence of operations takes place: The unary 
function pointer foo is converted into a future 
unary function object with the use of the adap- 
tor future_ptr_fun. Then the operator ’()’ is in- 
voked on the resultant future unary function 
object with argument 4. This results in a task 
being scheduled to compute foo with argument 
4 and a future handle is returned. 


In the statement(s) labeled (B) the user code can 
perform computations that do not need the re- 
sults of the foo function described above. 


In the statement labeled (C) the result of the foo
function is required and the user code adds 3 to
this result and assigns it to final. Note that fval
is of a future type while 3 is of type int. Since +
is only defined between similar types (and not
between future and int types) this statement
would be valid only if there was a way to go
from the future type to its Result type (i.e., int
type). As mentioned before, every future type
has to provide a cast operator to its Result type.
In the implementation of the cast operator, the
current thread waits for the task corresponding
to this future resolution to finish and returns
the value returned by the task.

Figure 2: The execution pattern of the example given
above: The statement labeled (A) creates an int
future object out of the foo function pointer by first
creating a future_ptr_to_unary_function object using the
future_ptr_fun adaptor and scheduling it with argument
4. Statement (B) represents an unrelated computation.
Statement (C) requires the future to be resolved as a
result of the addition operation between the int future
object and an integer constant 3 to produce ’final’.
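
The walk-through of statements (A), (B), and (C) can be approximated with standard C++11 facilities. The following is a minimal sketch, not the Coir<Futures> implementation: the class names FutureValue and FuturePointerToFunction are hypothetical, std::async stands in for the library's task scheduler, and a variadic template replaces the family of N-ary adaptors.

```cpp
#include <future>

// Hypothetical future handle (not the library's class): wraps a
// std::shared_future and adds the implicit conversion to Result.
template <class Result>
class FutureValue {
public:
    explicit FutureValue(std::future<Result> f) : fut_(f.share()) {}
    // Cast operator: blocks until the task finishes, then yields its value.
    operator Result() const { return fut_.get(); }
private:
    std::shared_future<Result> fut_;
};

// Hypothetical function object returned by the adaptor.
template <class Result, class... Args>
class FuturePointerToFunction {
public:
    explicit FuturePointerToFunction(Result (*f)(Args...)) : f_(f) {}
    // Schedules f_(args...) as a task and returns a future handle.
    FutureValue<Result> operator()(Args... args) const {
        return FutureValue<Result>(
            std::async(std::launch::async, f_, args...));
    }
private:
    Result (*f_)(Args...);
};

// Adaptor in the spirit of future_ptr_fun (assumed shape).
template <class Result, class... Args>
FuturePointerToFunction<Result, Args...>
future_ptr_fun(Result (*f)(Args...)) {
    return FuturePointerToFunction<Result, Args...>(f);
}

int foo(int x) { return x * x; }   // placeholder task body

int run_example() {
    FutureValue<int> fval = future_ptr_fun(foo)(4);  // (A) schedule foo(4)
    int unrelated = 2 * 21;                          // (B) unrelated work
    int final_value = fval + 3;                      // (C) conversion resolves the future
    return final_value + (unrelated - 42);
}
```

The addition in (C) compiles only because FutureValue<int> converts implicitly to int, which is the role the text assigns to the cast operator.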


4.2. Future-based Algorithms 


Typically parallel data abstractions are built around 
arrays or collection types. Since multiple processors 
and address spaces may be involved, the abstrac- 
tions are for their distribution, access, and update 
across memory and processor spaces. Iterators provide
a way to traverse aggregate data types and
to reference and update particular elements. STL 
algorithms work using iterators (instead of work- 
ing on containers directly), iterators are provided 
by containers, and containers are template classes. 
Just as containers are instantiated on regular C++ 
data types, they can be instantiated on future type 
instantiations. Because iterator traversal and ac- 
cess operators do not assume anything about the 
data type contained in the containers, the same pro- 
grams that work for containers of basic data types 
can work for containers of future data types. Asyn- 
chronous and delayed computations come for free, 
as values are not required until the point where the 
dereferencing * iterator operator is used to access 
the value of the base type. When the result type 
of the use of the * operator resolves into a base 
type, future resolution takes place. Thus sequential 
STL applications can be easily adapted to future-based
parallel applications. Future-based iterators
and containers can be assigned, copied, and passed 
as arguments to the point where their values are 
accessed. This allows us to write programs where 
delayed evaluations can be exploited across proce- 
dure boundaries. 


4.2.1. Example


The STL find algorithm looks for the first element
in a list whose value equals value. It is written as follows:


template <class InputIterator, class T>
InputIterator find(InputIterator first,
                   InputIterator last,
                   const T& value)
{
    while (first != last && *first != value)
    {
        ++first;
    }
    return first;
}


The following code fragment builds a list of future 
computations and executes find() over this list to 
find the first element in the list with a zero value. 


int foo(const int i);
typedef Future<int> future_type;
typedef list<future_type> list_type;
list_type xlist;

for(int i = 0; i < N; i++)
    xlist.push_back(future_type(foo, i)); // add to list

list_type::iterator pos
    = find(xlist.begin(), xlist.end(), 0);


Here find computes only to the point where it finds 
an element with a value = value. The remaining 
futures need not even be computed or resolved. 
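
The effect can be reproduced with standard components. The sketch below is an illustration under stated assumptions, not the library's code: LazyFuture is a hypothetical stand-in for Future<int>, it uses std::launch::deferred so that a future is evaluated only when its value is read, and the global counter exists only to make the number of resolved futures observable.

```cpp
#include <algorithm>
#include <cstddef>
#include <future>
#include <iterator>
#include <list>

// Hypothetical stand-in for the paper's Future<T>: a copyable handle
// whose conversion operator resolves the value on demand.
template <class T>
class LazyFuture {
public:
    template <class F, class Arg>
    LazyFuture(F f, Arg arg)
        : fut_(std::async(std::launch::deferred, f, arg).share()) {}
    operator T() const { return fut_.get(); }   // resolution point
private:
    std::shared_future<T> fut_;
};

int evaluations = 0;                             // counts resolved futures

// Placeholder computation: yields zero only for i == 3.
int foo(const int i) { ++evaluations; return 3 - i; }

// Builds n futures and returns the position of the first zero value.
std::ptrdiff_t index_of_first_zero(int n) {
    std::list<LazyFuture<int> > xlist;
    for (int i = 0; i < n; ++i)
        xlist.push_back(LazyFuture<int>(foo, i));
    // The unmodified sequential algorithm; each comparison resolves
    // exactly one future through the conversion operator.
    std::list<LazyFuture<int> >::iterator pos =
        std::find(xlist.begin(), xlist.end(), 0);
    return std::distance(xlist.begin(), pos);
}
```

With ten futures, only the first four are resolved before find returns. A scheduler that starts tasks eagerly, as discussed in Section 4.3, may still compute the remaining ones.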


4.3. Future Implementations 


Currently Coir<Futures> implements futures us- 
ing an extended form of the leapfrogging mecha- 
nism discussed in [24]. This mechanism combines 
leapfrogging with task stealing to achieve load bal- 
ancing. Since this paper concentrates more on the 
generic programming interface aspect of the system, 
we do not describe the implementation details. It 
may be noted that, due to the object-oriented nature
of the system, different implementations (depending
on the underlying thread system and the hardware
environment) may be supported without changing
the interface.

There may be situations (like the one described
in the example in the previous section) where future
objects are created but are never resolved.
The semantics and our implementation guarantee
that the future value is made available at the
point where it is required. However, futures may
be scheduled and computed even if their values
are not required. Consider the following example:


future_pointer_to_unary_function<
    int,int>::future_type fval1
    = (future_ptr_fun(foo))(4);
future_pointer_to_unary_function<
    int,int>::future_type fval2
    = (future_ptr_fun(foo))(5);

int x = fval1;
... fval2 is never resolved ...
... x is never used ...


This program creates two future computations, fval1
and fval2. fval2 is never resolved, while fval1 is
resolved to an integer variable x, but the resolved
variable x is never used. Optimizing C++ compilers
that perform extensive flow analysis may be
able to eliminate dead code of this kind. In compiler
terms, x and fval2 are dead variables and their
computation may be eliminated if other conditions,
such as the absence of side effects in their computation,
are satisfied. Coir<Futures> does not make any special
effort to avoid scheduling dead futures. We rely
on the C++ compiler for such optimizations. The
compiler may not be able to eliminate all cases of
dead futures. For instance, the ones in the example
in the previous section are embedded in a list and
are harder for the compiler to identify.


5. Data Parallel Objects 


In our future based environment, data parallel ob- 
jects are schedulable objects not restricted to SPMD 
programming. Further, multiple data parallel com- 
putations can interleave. These data parallel ob- 
jects can be thought of as future objects used to 
schedule data parallel operations where each thread 
operates on a different piece of data. The future 
resolution is a reduction of the values returned by 
each of the parallel functions. Just as the under- 
lying control parallel component in Coir<Futures> 


is a thread, the underlying data parallel component 
is a rope or a group of threads. Rope objects de- 
fine functional data parallelism and synchronization 
among threads. A global rope is defined from the 
main threads of all the processors participating in 
a parallel program. The following are some of the 
functions provided by the base Rope class: 


1. static Rope& SelfRope() - identify the cur- 
rently executing rope 


2. int Size() const - number of threads in a rope 


3. int SelfIndex() const - index of the currently 
executing thread in its rope 


4. int Index(const Thread& thr) const - index of 
the thread ’thr’ in the rope 


5. Thread& operator[](const int index) const - the
’index’th thread in the rope.


6. Reduction& ReductionObj() const - the reduc- 
tion operation skeleton in the rope (see below 
for description.) 


Derived rope classes can be templatized on the type
signatures of the data-parallel tasks they would execute.
Coir<Futures> defines UnaryRope and BinaryRope
derived template classes where the template
parameters specify that the tasks are represented
by unary and binary function object types,
respectively.


5.1. Reduction 


An important aspect of data parallelism is the re- 
duction operation where each thread contributes a 
value and the values are reduced using a function 
to obtain and return a reduced value to each of the 
threads. Reduction can be defined as follows:

Let $S_\tau$ be the set of all elements of type $\tau$. Let
$\mathcal{P}(S_\tau)$ denote the set of all finite subsets of $S_\tau$. Let
$S_\oplus$ be the set of all commutative and associative
binary operators/functions defined over elements of
type $\tau$. Then a reduction operator, $\rho$, is defined
from $S_\oplus \times \mathcal{P}(S_\tau) \to S_\tau$. A reduction operation is
the application of an operator $\oplus \in S_\oplus$ to a set of
elements of type $\tau$ and returns a result of type $\tau$.
Operationally, if $S_\tau$ is the set $\{a_1, a_2, a_3, \ldots, a_n\}$,
then

$$\rho(\oplus, S_\tau) = \oplus(a_1, \oplus(a_2, \oplus(a_3, \ldots \oplus(a_{n-1}, a_n))))$$

is one way of applying the operator.
As an example, let $\tau$ be the int type, and
$S_\tau$ be instantiated to a vector v of type
vector<int> with 10 elements {4, 5, 7, 3, 2, 10, 5, 8, 0, 12}.
Let $\oplus$ be the addition operator +. Since +
is associative over the set of integers, it qualifies
as a reduction operation and $\rho(+, v) =
+(4, +(5, +(7, +(3, \ldots, +(0, 12)) \ldots))) = 56$.

The reduction operation itself is a parallel op- 
eration that can be done with O(logN) complexity 
using a tree-style operation. There are three aspects 
to reduction: 


1. The data types of the individual values con- 
tributed by each of the threads. 


2. The computation pattern of the reduction op- 
eration. 


3. The reduction operator or function that is used 
to reduce the values. This should be an asso- 
ciative function (i.e., f(f(x,y),z) = f(x, f(y,z))). 


It may be noted that the reduction computation 
pattern is independent of the data types of the val- 
ues contributed. It just depends on the amount of 
parallelism that is available in terms of the number 
of processors and threads. Tree reductions are effec- 
tive in reducing the parallel complexity. However, 
doing the reduction operation this way in parallel 
may demand that the reduction operator not only 
be associative but also be commutative (i.e., f(x,y) 
= f(y,x)). 
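
A tree-style reduction can be sketched in a few lines. The following is an illustrative, sequentially simulated version and not the Coir<Futures> skeleton: the function name tree_reduce and the stride-doubling pairing are assumptions chosen for brevity. With one thread per element, all combinations inside one round are independent, so the number of rounds gives the O(log N) parallel depth.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Pairwise (tree-style) reduction, simulated sequentially. In each round,
// element i absorbs element i + stride. Requires a non-empty input.
// If `rounds` is non-null it receives the number of rounds performed.
template <class T, class Op>
T tree_reduce(std::vector<T> values, Op op, int* rounds = nullptr) {
    int r = 0;
    for (std::size_t stride = 1; stride < values.size(); stride *= 2, ++r)
        for (std::size_t i = 0; i + stride < values.size(); i += 2 * stride)
            values[i] = op(values[i], values[i + stride]);
    if (rounds) *rounds = r;
    return values.front();
}

// Applies the sum reduction to the 10-element vector used in the text.
int reduce_paper_vector(int* rounds) {
    std::vector<int> v = {4, 5, 7, 3, 2, 10, 5, 8, 0, 12};
    return tree_reduce(v, std::plus<int>(), rounds);
}
```

For the ten-element vector this yields the same sum as the nested application shown earlier, in four rounds rather than nine sequential steps.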

Since the computation pattern is independent of
the data types of the values and the reduction op- 
erator, a computation pattern skeleton can be built 
at the time of rope creation. This skeleton can be 
used and reused for different reduction operations. 
The way Coir<Futures> defines the computation 
pattern, it has the following properties: 


1. Reduction operations are rope-specific. Thus
reduction operations belonging to different
ropes are non-interfering.


2. Two different reduction operations within the
same rope are non-interfering. This is ensured
by defining this skeleton to consist of two trees
- a fan-in tree and a fan-out tree.


5.1.1. The Reduction Skeleton 


The reduction skeleton consists of a fan-in tree and 
a fan-out tree. 

The fan-in tree has N nodes, where N is the num- 
ber of threads in the rope. Each node is identified 
by a distinct thread index (0 to N-1). During the
fan-in reduction computation the reduction operation
takes place in a bottom-up fashion - starting
at the leaf and going to the root. At the end of this
the root has the reduced value.

The fan-out tree also has N nodes and the nodes 
are identified by thread indices. The fan-out phase 
is a broadcast phase where the reduced value is 
broadcast to each individual thread in a top-down 
fashion - starting at the root and going to the leaf. 

In reality there is only one tree, as specified in
the Reduction class; only the traversals specified by
fan-in and fan-out are different.


The Reduction Class 


The Reduction tree class description is parameterized
so that it works for any tree from unary
to (N-1)-ary (see footnote 2), and so that the fan-in and
fan-out trees can have different ranks (e.g., one can have
a binary fan-in tree and a quaternary fan-out tree).
Past research has shown that the ranks of the fan-in
and fan-out trees have a significant effect on parallel
synchronization performance, and should be determined
based on the architecture, number of processors,
and memory hierarchy of the parallel system [13].


template <class SizeType>
struct FanInNode
{
    typedef SizeType fanin_size_type;
    enum { fanin_size = sizeof(fanin_size_type) };
    bool ith_fanin_child_exists(const int i);
};

template <class SizeType>
struct FanOutNode
{
    typedef SizeType fanout_size_type;
    enum { fanout_size = sizeof(fanout_size_type) };
    bool ith_fanout_child_exists(const int i);
};

struct Reduction
{
    struct MyNode: public FanInNode<short>,
                   public FanOutNode<int>
    {
    };

    typedef MyNode node_type;

    Reduction(const int size);   // constructor
    virtual ~Reduction();        // destructor

    // number of threads participating
    int Size() const;
    // get the node corresp. to the ith thread
    node_type& get_node(const int i);
    // get this thread's node
    node_type& get_my_node();
    // index of the fanin parent of the ith thread
    static int fanin_parent(const int i);
    // index of the fanout parent of the ith thread
    static int fanout_parent(const int i);
    // index of the kth fanin child of the ith thread
    static int fanin_child(const int i, const int k);
    // index of the kth fanout child of the ith thread
    static int fanout_child(const int i, const int k);
};


Footnote 2: Read (N-1)-ary as (N minus 1)-ary.


Implementation Details 


The class FanInNode defines the fan-in node properties.
It defines a type fanin_size_type. For a fan-in
size of k this is a C++ data type whose size
(as given by the C++ sizeof operator) is k bytes.
For instance, the code defines fanin_size_type to be
a short to specify a fan-in size of 2 (given by the
field fanin_size), since short is 2 bytes in many
implementations (see footnote 3). This class also defines a boolean
query member function, which when given the index
i says whether or not it has an ith child (note that
any node in the fan-in tree has at most fanin_size
children).

The class FanOutNode defines the fan-out node
properties. It defines a type fanout_size_type. For a
fan-out size of k this is a C++ data type whose size
(as given by the C++ sizeof operator) is k bytes.
For instance, the code defines fanout_size_type to be
an int to specify a fan-out size of 4 (given by the
field fanout_size), since int is 4 bytes in many
implementations. This class also defines a boolean query
member function, which when given the index i says
whether or not it has an ith child (note that any
node in the fan-out tree has at most fanout_size
children).

Footnote 3: Although making assumptions about the sizes of ints
and shorts may seem to go against the object-oriented paradigm, it is
important for obtaining good performance. ints and shorts can
be used in conditional expressions and can be checked using
a single scalar compare operator. Since the fields on which
the condition checks are made are prone to shared-memory
bottlenecks, such optimizations contribute to significant
performance improvements.


The four functions fanin_parent, fanout_parent,
fanin_child, and fanout_child are static because
these functions depend only on the fanin_size and
fanout_size quantities, which are constants, and the
arguments passed to these functions. This permits
the C++ compiler to perform inlining optimizations.
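
The paper does not give the index arithmetic behind these four functions. As an illustration only, the sketch below assumes a breadth-first numbering of the nodes with node 0 as the root; the TreeIndex template and its member names are hypothetical. Because the rank is a compile-time constant, each function depends only on its arguments and can be inlined, which is the property the text relies on.

```cpp
// Assumed layout (not taken from the paper): nodes are numbered in
// breadth-first order, so node 0 is the root of a Rank-ary tree.
template <int Rank>
struct TreeIndex {
    // Parent of node i, or -1 for the root.
    static int parent(const int i) { return i == 0 ? -1 : (i - 1) / Rank; }

    // Index of the k-th child (0 <= k < Rank) of node i. The result may
    // lie beyond the last node; use child_exists to check.
    static int child(const int i, const int k) { return Rank * i + k + 1; }

    // Whether the k-th child of node i exists in a tree of `size` nodes.
    static bool child_exists(const int i, const int k, const int size) {
        return k >= 0 && k < Rank && child(i, k) < size;
    }
};

// Ranks derived from type sizes, mirroring the FanInNode<short> and
// FanOutNode<int> convention (2 and 4 on many implementations).
typedef TreeIndex<sizeof(short)> FanIn;
typedef TreeIndex<sizeof(int)>   FanOut;
```

Under this numbering a rope of 10 threads with a binary fan-in tree has node 4 as the parent of node 9 only, since its second child would be index 10.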


5.1.2. Type-specific Reduction 


In a data parallel operation involving N threads
of a rope, the threads can participate in a type-specific
reduction. As discussed above, each rope
object has the skeleton of a tree-based reduction
operation. The threads in a rope can enter
type-specific reductions by cloning this reduction
skeleton to a reduction object for that type.
To achieve this Coir<Futures> defines a ReductionT
template class parameterized on the type
of the value the threads contribute to a reduction
operation. This class is defined as follows (see footnote 4):


template <class T, class ReducerType
              = binary_function<T, T, T> >
class ReductionT
{
public:
    typedef ReducerType reducer_type;
    typedef T data_type;
    ReductionT(Reduction& my_red
                   = Rope::SelfRope().ReductionObj());
    ~ReductionT();
    // reduction operator
    T operator()(reducer_type reducer,
                 const T& data);
};


5.2. Using Reduction Objects 


The user code uses the reduction facility as given 
below. Each thread does the following: 


1. Obtain the reduction tree skeleton object cor- 
responding to this thread’s rope. 


2. For each type T for which a reduction operation
is to be performed, create a type-specific
per-thread reduction object.


3. For each reducer (binary commutative and associative
operator for type T), invoke the ’()’
operator of the ReductionT object.


Footnote 4: The declaration shown here uses a template parameter
constraint, where ReducerType is specified to be of type
binary_function. This is only for documentation purposes. Our
compiler did not support specifying template constraints at
the time of this implementation.


The following piece of code shows a sum reduc- 
tion followed by a product reduction on integer 
types. It is executed by each thread in the rope. 


// build the type-specific reduction object
ReductionT<int, binary_function<int, int, int> >
    red_obj(Rope::SelfRope().ReductionObj());

int my_contrib = ...

// sum reduction
int my_sum = red_obj(plus<int>(), my_contrib);

// product reduction
int my_prod = red_obj(times<int>(), my_contrib);


In the ReductionT declaration, ReducerType is the type of a
binary function object which expects two arguments of a type
convertible to type T, and whose result type is a type
convertible to type T. The constructor expects a
Reduction object as an argument, which is typically
Rope::SelfRope().ReductionObj(). The class also
exports a type called reducer_type which is the same
as the actual argument for the template parameter 
ReducerType, and a type data_type which is basi- 
cally type T. The () operator takes two arguments 
- reducer is a binary commutative and associative 
function object that is used in the reduction, and 
data is the contribution of the thread to the re- 
duction operation. The () operator performs the 
actual reduction - each thread participating in the 
reduction operation invokes the () operator while 
in a data parallel computation. Note that all the 
threads should specify the same reducer_type argu- 
ment. Making it a part of the () operator allows 
us to reuse the same reduction object for a different 
reduction operation easily. 


6. Data Parallel Algorithms 


STL-provided algorithms and typical user-written 
algorithms use one or more iterators as arguments 
and traverse, build, and update containers using 
these iterators. In providing parallel implementa- 
tions of these algorithms it would be useful to de- 
sign the implementations in such a way that differ- 
ent strategies of work distribution can be made a 
part of the algorithm implementation. In the sequential
world, STL reduces N*M*K possible implementations
for N data types, M containers and
K algorithms to N+M+K implementations. In the
parallel world, given P strategies of work distribution,
we would like to extend STL so that N*M*K*P
implementations are reduced to N+M+K+P implementations.
This can be done with the support
of per-thread parallel iterators and strategy classes
discussed below.

Data parallel versions of sequential algorithms 
have a group of threads participating in the algo- 
rithm. The parallel implementation of a sequen- 
tial algorithm is a template function with one tem- 
plate parameter in addition to its sequential coun- 
terpart. This parameter represents the Strategy 
class. The parallel algorithm function accepts an ex- 
tra parameter as compared to the sequential coun- 
terpart. This parameter is the strategy object which 
is an instance of the Strategy class. In the body of 
the parallel algorithm, the strategy object is used 
to convert each of the sequential iterators to per- 
thread parallel iterators. The per-thread parallel 
iterator traverses the container in such a way that 
it touches those parts of the container for which the 
thread that this iterator corresponds to is responsible.
When each thread has computed a partial
result, the results are composed through a reduction
operation (see figure 4 and compare with figure 3).
Note that not all STL algorithms can be written
exactly this way, mainly because the composition
of the partial results may not be an associative or
commutative operation for a particular style of work
distribution.


6.1. Per-thread Parallel Iterators 


STL iterators are inherently sequential in nature
because each iterator defines a single cursor of traversal
and update. This is inadequate for parallel programming.
We can define per-thread iterators based
upon the strategy used for accessing and traversing
the iterator space by the threads taking part in a
data parallel operation.


6.2. Strategies 


Strategies are data types that specify how work is 
distributed over the threads participating in a data 
parallel algorithm. As algorithms typically operate 
over iterators, strategy classes convert sequential it- 
erators to per-thread parallel iterators. Conceptu- 
ally, strategies define iterator adaptors. A strategy 
class is templatized on iterator types, value types, 
reference types, and distance types. It supports op- 
erators that, given sequential iterators, return per- 
thread parallel iterators. Strategies may be static 
or dynamic.


[Figure 3 diagram: sequential iterators i1, i2, i3 and other input -> Sequential Algorithm -> Output]

Figure 3: Typical forms of sequential algorithms in STL
style: The sequential algorithm takes in some iterators
(the one in the figure takes 3) and other input parameters,
traverses the iterators sequentially and produces
an output.


Static Strategies 


Static strategies specify ‘compile-time’ work distri- 
bution where the region of the iterator space that a 
thread works on is decided based on the thread in- 
dex, number of threads, and the size of the iteration 
space. Block and Cyclic strategies are examples of 
this strategy type. Given an iteration space with
N values, with indices 0 to N-1, over which T
threads operate, the iteration space of the ith thread
in the Block strategy is given by
$\{\, i\frac{N}{T},\ i\frac{N}{T}+1,\ \ldots,\ (i+1)\frac{N}{T}-1 \,\}$,
where $0 \le i < T$. The iteration space of the
ith thread in the Cyclic strategy is given by
$\{\, i,\ i+T,\ i+2T,\ \ldots,\ i+(\frac{N}{T}-1)T \,\}$.
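
The two static distributions can be made concrete with a short sketch. It is illustrative only: the function names are hypothetical, and the Block version assumes that T divides N so that every thread owns exactly N/T consecutive indices.

```cpp
#include <vector>

// Indices owned by thread i under the Block strategy: one contiguous
// chunk of N/T indices. Assumes T divides N.
std::vector<int> block_indices(int N, int T, int i) {
    std::vector<int> out;
    const int chunk = N / T;
    for (int x = i * chunk; x < (i + 1) * chunk; ++x)
        out.push_back(x);
    return out;
}

// Indices owned by thread i under the Cyclic strategy: every T-th index
// starting at i.
std::vector<int> cyclic_indices(int N, int T, int i) {
    std::vector<int> out;
    for (int x = i; x < N; x += T)
        out.push_back(x);
    return out;
}
```

For N = 12 and T = 3, thread 1 owns indices 4 to 7 under Block and indices 1, 4, 7, 10 under Cyclic. In both cases the assignment depends only on the thread index, the number of threads, and the size of the iteration space.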


Dynamic Strategies 


Dynamic distribution strategies can be used for dy- 
namic load-balancing. Here the traversal pattern of 
the per-thread iterators is decided dynamically. An 
example is the Grab strategy. Here all the threads in
the rope operate over the original iterator space; the 
iterator traversal is monitor-based; only one thread 
can be doing an iterator operation at any time; and 
this operation affects any subsequent iterator oper- 
ation by any other thread. Traversal of the iteration 
space is determined by the work load on the threads 
and the processors on which they execute and may 
be different between runs of the same program. 


[Figure 4 diagram: sequential iterators i1, i2, i3 and other input -> Parallel Algorithm -> Output]

Figure 4: In a typical parallel algorithm, strategy objects
are used to convert the sequential iterators to per-thread
parallel iterators, which are used in the partial
algorithms by each thread. The results computed by
each of the threads in the partial algorithms are composed
to produce the final output.


Figure 5 shows an example of the sequential
traversal and parallel traversal in Block, Cyclic, and
Grab strategies.
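
The monitor-based traversal of the Grab strategy can be pictured as a shared cursor guarded by a lock. The sketch below is an assumption-laden illustration, not the library's iterator: the GrabCursor class, its grab member, and the use of std::mutex as the monitor are all hypothetical.

```cpp
#include <mutex>

// Hypothetical monitor-protected cursor shared by all threads in a rope.
class GrabCursor {
public:
    GrabCursor(int begin, int end) : next_(begin), end_(end) {}

    // Claims the next unvisited index and stores it in `index`. Returns
    // false once the iteration space is exhausted. Only one thread can be
    // inside this function at a time, and each call changes what the next
    // caller will see.
    bool grab(int& index) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (next_ == end_) return false;
        index = next_++;
        return true;
    }

private:
    std::mutex mutex_;
    int next_;
    int end_;
};

// Work loop a thread would run: returns how many elements it claimed.
int drain(GrabCursor& cursor) {
    int processed = 0;
    int index;
    while (cursor.grab(index)) ++processed;
    return processed;
}
```

When several threads call drain on the same cursor, the split of indices among them depends on timing, which is why the traversal can differ between runs of the same program.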


6.2.1. Requirements of a Strategy Class 


The strategy class should export a type definition 
for a corresponding per-thread iterator for one or 
more STL sequential iterators. Providing a per- 
thread iterator for each of the sequential iterators 
makes a strategy class more usable in the sense that 
it can possibly be used to parallelize all STL-based 
sequential algorithms. 

Many algorithms (e.g., for_each, count, find) in 
STL work with iterator pairs - to indicate the begin- 
ning and end of a sequence. They might also have 
a third iterator that follows the traversal between 
the first two iterators. To help the parallelization 
of such algorithms a strategy class may provide two 
’()’ operators:


1. operator()(const int size, iterator_type begin,
iterator_type end, thread_iterator_type&
thr_begin, thread_iterator_type& thr_end); This
operator basically returns the per-thread iterators
given a pair of sequential iterators. The
size parameter indicates the number of participating
threads. This operator may be defined
for one or more of the iterator types.


2. operator()(const int size, iterator_type1
begin1, iterator_type1 end1, iterator_type2
begin2, thread_iterator_type1&
thr_begin1, thread_iterator_type1& thr_end1,
thread_iterator_type2& thr_begin2); In addition
to returning per-thread iterators corresponding
to ’begin1’ and ’end1’, this also returns the
per-thread iterator corresponding to iterator
’begin2’ that depends on ’begin1’ and ’end1’.


[Figure 5 diagram: sequential iterators Begin and End, with per-thread parallel iterators Begin1, Begin2, Begin3 and End1, End2, End3 shown for the Block, Cyclic, and Grab strategies]

Figure 5: Sequential iterators and the corresponding
per-thread parallel iterators with 3 threads for Block,
Cyclic, and Grab Strategies: For each of Begin and
End sequential iterators there are 3 per-thread iterators.
Their traversal pattern depends on the strategy
used.


In many algorithms the traversals from begin1
and begin2 typically happen in sync and make some 
assumptions about the relative positions. So the 
corresponding per-thread iterator should not lose 
this relation information. Static strategies can sup- 
port iterators that retain this relationship because 
the traversal paths of the per-thread iterators are 
known. But dynamic strategies like the Grab strategy
cannot guarantee this because the scope of
the monitors used to control the traversal is per-iterator,
and using monitors over two iterator traversals
would mean that we cannot reuse the sequential
algorithm. Hence, while the parallel versions of
some algorithms like swap_ranges and transform for 
static strategies are simple extensions of the sequen- 
tial ones, the ones for the dynamic strategies may 
be quite different. 

There are other algorithms in which even different
static strategies might need different implementations.
This depends on how the per-thread partial
results are composed to produce the final results.
For instance, the remove algorithm removes
elements of a particular value from a list. With a
Cyclic work strategy it cannot compose the updated
list by merely appending the per-thread results in a
reduction operation, since this would put the elements
back out of order. However, this can be done for the Block
strategy.

Thus, it may not be possible to provide the same 
algorithm implementation for two different strategy 
classes for efficiency or correctness reasons. For this 
reason we introduce strategy tags and strategy cate- 
gories similar to iterator tags and iterator categories 
in STL. 


6.2.2. Strategy Categories and Tags 


Coir<Futures> strategy classes are derived
from two base classes - static_strategy and
dynamic_strategy. These two classes are empty
classes (similar to input_iterator, output_iterator,
etc., classes). We also define a strategy tag
corresponding to each strategy class (similar to
iterator tags in STL), which can be used to provide
implementations specific to different strategies.
The strategy tag for any strategy class S can
be obtained from the strategy_category function:


S_strategy_tag strategy_category(const S& strategy)
{
    return S_strategy_tag();
}


If an algorithm, say transpose, has different
implementations for the Block and Cyclic
strategies (either for reasons of efficiency
or for correctness) it can be implemented as:


template <class Strategy, class InputIterator>
OutputIterator transpose(Strategy& strategy,
                         InputIterator first,
                         InputIterator last)
{
    return transpose(strategy, first, last,
                     strategy_category(strategy));
}

template <class Strategy, class InputIterator>
OutputIterator transpose(Strategy& strategy,
                         InputIterator first,
                         InputIterator last,
                         block_strategy_tag btag)
{
    ... code specific to Block strategy ...
}

template <class Strategy, class InputIterator>
OutputIterator transpose(Strategy& strategy,
                         InputIterator first,
                         InputIterator last,
                         cyclic_strategy_tag ctag)
{
    ... code specific to Cyclic strategy ...
}
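
The dispatch mechanism itself can be exercised in isolation. The following compilable sketch uses placeholder strategy classes and a trivial describe function in place of transpose; all names other than strategy_category, block_strategy_tag, and cyclic_strategy_tag are hypothetical.

```cpp
#include <string>

// Empty tag types, one per strategy.
struct block_strategy_tag {};
struct cyclic_strategy_tag {};

// Placeholder strategy classes (their real contents are omitted here).
struct BlockStrategy {};
struct CyclicStrategy {};

// strategy_category maps a strategy object to its tag.
inline block_strategy_tag strategy_category(const BlockStrategy&) {
    return block_strategy_tag();
}
inline cyclic_strategy_tag strategy_category(const CyclicStrategy&) {
    return cyclic_strategy_tag();
}

// Strategy-specific implementations, selected by the tag argument.
template <class Strategy>
std::string describe(Strategy&, block_strategy_tag) { return "block"; }

template <class Strategy>
std::string describe(Strategy&, cyclic_strategy_tag) { return "cyclic"; }

// Public entry point: forwards to the overload matching the tag.
template <class Strategy>
std::string describe(Strategy& strategy) {
    return describe(strategy, strategy_category(strategy));
}
```

The overload is chosen at compile time from the tag's type, so the dispatch adds no run-time branch, just as with iterator tags in STL.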


6.3. Example of a Parallel Algorithm


STL provides a count algorithm for counting the
number of elements in the iterator range between first
and last that are equal to value, and adds
the result to the argument n. The following
piece of code shows a sequential implementation:


template <class InputIterator, class T, class Size>
void count(InputIterator first,
           InputIterator last,
           const T& value,
           Size& n)
{
    while (first != last)
        if (*first++ == value)
            ++n;
}

The following algorithm provides a parallel im- 
plementation of the same using strategy classes. 
This assumes that each thread participating in a 
data-parallel operation invokes this function and 
provides its own local copy of n. The imple- 
mentation uses the sequential version of count 
where each thread computes the partial results. 


template <class Strategy, class InputIterator,
          class T, class Size>
void count(Strategy& strategy,
           InputIterator first,
           InputIterator last,
           const T& value,
           Size& n)
{
    // get the current data parallel object
    Rope& self_rope = Rope::SelfRope();
    // get the rope size
    int rope_size = self_rope.Size();
    // create a type-specific reduction object
    ReductionT<Size, plus<Size> >
        red(self_rope.ReductionObj());
    // create thread-specific iterators
    typename Strategy::thread_iterator my_first(first);
    typename Strategy::thread_iterator my_last(last);
    // position the iterators based on strategy
    strategy(rope_size, first, last, my_first, my_last);
    Size my_n = 0;
    // per-thread sequential algorithm
    ::count(my_first, my_last, value, my_n);
    // combine the partial results
    Size total = red(plus<Size>(), my_n);
    n += total;
}
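
The structure of this algorithm (partition, count locally, reduce) can be checked without a thread library. The sketch below is a sequential simulation under stated assumptions: it fixes a Cyclic distribution, loops over thread indices instead of running real threads, and replaces the ReductionT sum with a plain accumulation. The function names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Partial count for one thread under a Cyclic distribution: the thread
// visits indices thread_index, thread_index + num_threads, and so on.
template <class T>
int partial_count(const std::vector<T>& data, const T& value,
                  int thread_index, int num_threads) {
    int my_n = 0;
    for (std::size_t x = thread_index; x < data.size(); x += num_threads)
        if (data[x] == value) ++my_n;
    return my_n;
}

// Simulates the rope sequentially: each loop iteration plays the role of
// one thread, and the running sum plays the role of the plus reduction.
template <class T>
int simulated_parallel_count(const std::vector<T>& data, const T& value,
                             int num_threads) {
    int total = 0;
    for (int i = 0; i < num_threads; ++i)
        total += partial_count(data, value, i, num_threads);
    return total;
}
```

Because addition is commutative and associative, the total does not depend on the order in which the partial counts are combined, which is what makes count a good fit for this pattern.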


7. Related Work 


One of the early works in C++ related to task par- 
allelism is the AT&T task library [20]. Early thread 
libraries in C++ include Presto [4] and the Brown 
thread library [7]. While these systems are shared 
memory implementations, the ACE system [18] is a 
pattern based [8, 5] C++ implementation for dis- 
tributed systems. A recent compilation of arti- 
cles [25] discusses a number of parallel C++ sys- 
tems. Most of the systems discussed in the col- 
lection extend the C++ language to support par- 
allelism. ABC++ and the Amelia vector library 
are two systems which provide parallelism within 
C++. ABC++ has the future abstraction but it 
does not take the generic programming paradigm 
approach. The Amelia vector library uses an STL- 
based mechanism to support reduction operations. 
Combining future abstractions and multi-threaded 
data parallel mechanisms with STL is novel to our 
work. The only other parallel library implemen- 
tation that incorporates STL is the HPC++ im- 
plementation from Indiana University [3]. HPC++ 
does not support future mechanisms, but has a def- 
inition for data parallel algorithms using parallel it- 
erators for an SPMD model and has a distributed 
memory flavor. Though the interfaces and the ar- 
chitecture focus of HPC++ and Coir<Futures> are 
different, the parallel iterators defined in HPC++ 
can be compared to our strategy classes and the 
iterators defined by the classes. Since SPMD pro- 
grams typically assume the “owner computes” rule, 
the HPC++ parallel iterators are designed around 
data distributions, while, since we have an underlying
MIMD paradigm and a shared memory architecture,
our parallel iterators are designed around task
distributions. Because massively parallel machines
like the IBM SP/x are being designed using SMP
nodes connected over distributed memory spaces,
our abstractions can form the node component of
the proposed HPC++ abstractions.


8. Conclusions 

In this article we discussed how the generic pro- 
gramming paradigm can be used for parallel pro- 
gramming. The discussion was in the context of 
Coir<Futures>, our multi-threaded parallel C++ 
library. We introduced two abstractions - one for 
future-based control parallelism and the other for 
data parallel generic algorithms. The idea of fu- 
tures is not new [2, 11, 24] but the notion of com- 
bining them with generic programming paradigms 
is one of the contributions of our work. Futures 
fit well with the template-based programming style 
of STL. Data parallel generic algorithms help reuse 
the corresponding sequential algorithms, provide an 
easy pattern for writing the parallel counterparts 
of the sequential algorithms, and provide a uni- 
form interface and even implementations for dif- 
ferent work distribution strategies in the spirit of 
generic programming. A generic reduction mechanism
and generic data-parallel algorithm design are
among the other contributions of this paper.

The idea of futures can be extended beyond just 
delayed computation. As futures are place-holders, 
they are similar to proxies. Futures need not only 
represent computation in the current address space, 
but can also be used for remote method invocations
and persistent data types. The use of object-oriented
and generic programming paradigms makes
it possible to provide a uniform interface for these 
objects with different behaviors and allow such ob- 
jects to interact easily. 

As we continue to work on the Coir<Futures> 
system, we are also working on more general ideas 
for software patterns [8, 5] for object-oriented par- 
allel programming. Though some language-specific 
work already exists [12], specifying them beyond the 
language barrier will aid in interoperability.

A Tool for Constructing Safe Extensible C++ Systems


Christopher Small 
Harvard University 


Abstract 


The boundary between application and system is 
becoming increasingly permeable. Extensible applica- 
tions, such as web browsers, database systems, and 
operating systems, demonstrate the value of allowing
end-users to extend and modify the behavior of what 
was formerly considered to be a static, inviolate system. 
Unfortunately, flexibility often comes with a cost: sys- 
tems unprotected from misbehaved end-user extensions 
are fragile and prone to instability. 

Object-oriented programming models are a good fit 
for the development of this kind of system. An extension
can be designed as a refinement of an existing
class, and loaded into a running system. In our model,
when code is downloaded into the system, it is used to
replace a virtual function on an existing C++ object. 
Because our tool is source-language neutral, it can be 
used to build safe extensible systems written in other 
languages as well. 

There are three methods commonly used to make 
end-user extensions safe: restrict the extension language 
(e.g., Java), interpret the extension language (e.g., Tcl), 
or combine run-time checks with a trusted environment. 
The third technique is the one discussed here; it offers 
the twin benefits of the flexibility to implement exten- 
sions in an unsafe language, such as C++, and the per- 
formance of compiled code. 

MiSFIT, the Minimal i386 Software Fault Isolation
Tool, can be used as the central component of a tool set
for building safe extensible systems in C++. MiSFIT
transforms C++ code, compiled by g++, into safe binary 
code. Combined with a runtime support library, the 
overhead of MiSFIT is an order of magnitude lower than 
the overhead of interpreted Java, and permits safe exten- 
sible systems to be written in C++. 


1 Introduction 

Software fault isolation is a technique for transforming 
code written in an otherwise unsafe language (e.g., C or 
C++) into safe compiled code. At transformation time,
each read, write, and jump instruction is analyzed and, if
necessary, transformed to ensure that it will not reach 
outside the memory region assigned to the code. 

Two other techniques for ensuring the safety of 
code are safe languages and interpreted systems. Safe 
languages, such as Java and Modula-3, are designed to 
make it difficult or impossible to write code that per- 
forms illegal or unsafe operations. By definition, safe 
languages are restricted; C++, which allows unchecked 
array accesses, pointer arithmetic, and arbitrary casting, 
is implicitly unsafe. 

Scripting languages, such as Tcl and Perl, enforce 
safety by validating each data access as it takes place. 
Although great strides are being made to improve the 
performance of interpreted languages through the use of 
dynamic code generation [Hölze94], the performance
overhead is at least a factor of two to ten over native 
compiled code. 

In earlier work [Small96], we measured byte-code 
interpreted Java taking ten to seventy times longer than 
compiled C code performing the same task.¹ The overhead
of software fault isolation is an order of magnitude
less than that of interpretation, and SFI techniques have 
the advantage of operating on assembler-level code, so 
they can be used with any source language. 

Although a small number of software fault isolation
tools exist, and the underlying techniques are not complex,
no tools have been made freely available on commodity
platforms such as the x86. MiSFIT, the Minimal
i386 Software Fault Isolation Tool, developed for use
with the VINO extensible operating system, is such a
tool. VINO is a new operating system, written in C++,
designed around the idea that system policies can be
modified, and kernel components reused, by downloading
extensions written by untrusted end-users and
protected by MiSFIT.

MiSFIT includes runtime support necessary to create
a sandbox in which the downloaded code will run.
Additional code (not provided as part of MiSFIT) is
needed to load the extension into the base system, verify
that the code was processed by MiSFIT, and offer a
library of routines that can be called by the extension.

1. The tests run in that paper were re-run with MiSFIT for this
paper. The results are found in Section 6.1.

MiSFIT accepts x86 assembler code, produced by
the GNU C++ compiler, as input, and produces fault-isolated
x86 assembler code as output. MiSFIT can be used
as a component of a safe code system, allowing otherwise
untrusted code to be linked to and run in the context
of an extensible application or system. For example,
MiSFIT can fault isolate dynamically linked extensions
to world-wide web browsers (e.g., Netscape Navigator),
kernel extensions (which are supported by a variety of
current systems, such as Solaris, NetBSD, MS-DOS and
Windows/NT), and client code linked to a database
server (e.g., the Illustra database server [Bloor96]).

Software fault isolation techniques can be imple- 
mented in a compiler pass [Silver96], a filter between 
the compiler and assembler (as in the case of MiSFIT), 
or a binary editing tool [Wahbe93]. MiSFIT works as an 
assembler-level filter for several reasons. First, not writ- 
ing a binary editing tool simplified the task tremen- 
dously, as there was no need to parse, disassemble, 
patch, and reassemble x86 binary code. Another motiva- 
tion was that it conforms to the Unix tool-oriented 
approach for building systems. By not adding it to g++, 
MiSFIT has a degree of compiler independence.
Although MiSFIT makes a (small) number of assump- 
tions about the format of its input, it could easily be 
modified to work with output from other compilers, 
such as lcc or Microsoft C++. 

MiSFIT takes the strategy of being platform specific
and language neutral; the Java Virtual Machine is
both platform neutral and language neutral. We found 
that any need we had for platform independence was 
outweighed by our need for high performance and the 
ability to write extensions in C++. 


2 SFI Is Not Enough 


MiSFIT is not a complete solution to the problem of
protection from misbehaved extensions. 

First, protection from errant writes and calls is not 
sufficient; the application or kernel must provide a safe 
interface to the extension, or a safe environment in 
which it can run. Protection against illegal stores is
useless if the extension can call bcopy() with arbitrary
arguments. Safe equivalents of many other commonly 
used routines, such as read(), write(), and printf() will 
also be needed. 

Second, and more importantly, software fault isola- 
tion (or any other memory protection mechanism) is not 
a substitute for a resource management strategy. An 
extension should not be allowed to allocate memory, 
obtain a lock for a critical data structure, or even be 
given the freedom to run on the CPU, unless some 
mechanism is provided for the resource to be revoked if
the extension fails to release it in a reasonable amount of
time. In related work [Seltzer96], we explored wrapping 
each extension invocation in a transaction; if the exten- 
sion aborted, or failed to complete promptly, our system 
could abort the transaction and nullify any changes 
made by the extension. 

The third way in which MiSFIT is not a complete 
solution is that it, by itself, does not ensure that a given 
piece of binary code has been processed by MiSFIT. 
There are at least two methods for solving this problem. 
First, extension writers can distribute source code for 
their extensions, and the person installing the extension 
could compile and MiSFIT the code before installing it. 
This technique may be reasonable for installing operat- 
ing system extensions, as is done now with loadable ker- 
nel modules in NetBSD and Linux. 

The second method is more end-user-friendly, but is 
logistically more complex. Code processed by MiSFIT 
would be given a cryptographic digital signature, either 
by the tool itself or by a signing authority. This signature 
would then be checked at load time. In order to support 
this scheme it would be necessary to find a trustworthy 
authority willing to MiSFIT and sign code, or somehow 
safely hide the apparatus for generating the signature 
within MiSFIT itself. 

Although there are pieces missing from MiSFIT to 
make it a complete environment for building extensible 
systems, they are both technically tractable and application
specific. For our project (the VINO extensible operating
system [Seltzer94]), we have developed a
protected runtime environment, resource management
infrastructure, and code signature scheme for use with
MiSFIT. Other applications of MiSFIT would necessarily
have a different safe runtime environment and
resource management infrastructure. 

The remainder of this paper focuses on related 
work, the architecture of MiSFIT, and its runtime sup- 
port. Section 3 contains a discussion of related work in 
extension technology. Section 4 discusses the design 
and implementation of MiSFIT, and Section 5 covers the 
related runtime support. Section 6 reports the overhead
of MiSFIT on benchmark programs. Section 7 discusses
what has been left out of MiSFIT, and the paper
concludes in Section 8.


3 Related Work 

The term Software Fault Isolation was introduced by
Wahbe et al. [Wahbe93]. They proposed a type of software
fault isolation, sandboxing, which has low overhead
on a processor with a large number of registers.
Their tool was originally targeted for the MIPS and
Alpha processors. The initial results for this work show
overheads of roughly five percent to ten percent.

2. Our code signature implementation uses the RSAREF
library [RSA], which is export controlled.

A follow-on to that work is the Omniware Portable
Code system. The Omniware compiler generates portable
code for an abstract virtual machine (OmniVM)
which is translated to native fault-isolated code at runtime
[Adl96]. Along with the source language independence
provided by software fault isolation techniques,
the Omniware system also offers target-independent
portable code.

Silver has developed a version of gcc which gener- 
ates software fault isolated code for the DEC Alpha pro- 
cessor [Silver96]. Most of the modifications to gcc were 
made in the machine-independent portion of the com- 
piler, although some changes were needed in the 
machine dependent portion of the code. The author 
reports that the implementation is dependent upon a 
large number of registers being available for use by the 
tool; a port to x86, which has a severely limited register 
set, appears to be difficult, if not impossible. 

Several other researchers in the area of extensible 
operating systems have developed one-off software fault 
isolation tools, including Banerji [Banerji96], Engler 
[Engler95], and Mazieres [Mazieres96]. Unfortunately 
these tools suffer from working on less widely used plat- 
forms, working only with domain-specific languages, or 
not being publicly available. 

Some extensible systems designers have followed a 
different route, proposing that extensions be written in a 
safe language (e.g., the SPIN operating system
[Bershad95], which uses Modula-3 [Nelson91], and
Netscape Navigator, which uses Java [Gosling96]). Safe
languages can perform as well as or better than
software-fault-isolated unsafe languages, but have the two
disadvantages that there is no possibility of reusing existing C
or C++ code, and that programmers need to develop
extensions in the safe language, and not the more familiar
and common unsafe languages. The performance
overhead of Modula-3 relative to compiled C or C++ 
appears to be negligible, but Java (which is most often 
interpreted by a virtual machine) is 20 to 50 times 
slower than equivalent compiled C code [Small96]. 

The Netscape Navigator world-wide web browser is
an interesting example of an extensible system. The current
release (3.0) supports two types of extensions: those
written in Java (a safe language) and JavaScript (an
interpreted scripting language). In order for Netscape
Navigator to support extensions written in Java on all
platforms, a complete implementation of the Java interpreter
and runtime environment must be developed on
each platform. It is arguably less work to construct a
simple software fault isolation tool for a hardware architecture
than to develop or port an interpreter and runtime
environment.

Although in previous work we have measured interpreted
Java as running ten to seventy times slower than
compiled C, several companies plan to release “just-in-time”
native code compilers for Java.³ These compilers
would convert Java bytecode into native code as it is
loaded (or first run). The overhead of running “just-in-time”
compiled code has been measured at two to ten
times that of regular compiled code [Hölze94], which
would give Java roughly the same performance as software
fault isolated code.

Microsoft offers the ActiveX extension mechanism, 
which provides no technical guarantee of safety, but 
instead supplies only a method for verifying the identity 
of the provider of the code through the use of digital sig- 
natures. Software fault isolation can work in concert 
with digital signatures, to guarantee both the identity of 
the provider and the safety of the code. 

The design of the VINO extensible operating system,
which is the primary testbed for MiSFIT, is
described in more detail in other work [Seltzer94, 
Seltzer96]. 


4 MiSFIT Design and Implementation 
Software fault isolation can be used to protect against 
illegal jumps, stores, and loads. Protecting against ille- 
gal stores and jumps is necessary for correctness, but 
protection from illegal reads is usually a security issue, 
not a correctness issue. (If an extension can read outside 
its memory bounds, it may be able to find data it should 
not be allowed to see, but if an extension can write or 
jump to an arbitrary location in memory, the stability 
and correctness of the host program can be compromised.⁴)

MiSFIT can be used to fault isolate indirect loads, 
stores, and calls. It acts as a filter, sitting between the 
compiler and the assembler. MiSFIT scans the output of 
the compiler and builds an in-memory representation for 
the module. It then processes each instruction of the 
module in turn. If any implicitly unsafe instruction (e.g., 
halt) appears, the module is rejected. The arguments for 
each store, call, and (optionally) load instruction are 
examined. (Constants and general-purpose registers are 
implicitly safe.) Once the module has been processed, 
simple peephole optimization is performed (to remove
any redundancies introduced by the SFI transformation)
and a new copy of the module is written out.

3. Symantec has shipped a just-in-time compiler, and Sun has
announced plans to do so.

4. This is not necessarily the case inside the operating system
kernel; on some hardware, such as the x86, device registers are
mapped into memory and reset themselves after being read.

[Figure 1 diagram: two panels, each showing an original address, the region tag, the offset into the region, and the resulting fault-isolated address.]

Figure 1: Example Transformations. In this example, the
region tag is the top sixteen bits of the address and has the
value 0xabcd. In the first example, the original address is
invalid, so the fault-isolated address is different. In the
second example, the original address is within the region,
so the fault-isolated address is the same as the original
address.


4.1 Indirect Loads and Stores 

Loads and stores that use an indirect address that is 
computed at run-time are potentially unsafe. MiSFIT 
inserts code to sandbox [Wahbe93] arguments of these 
instructions to force the indirect address to fall within a
legal range. 

Each user extension is assigned a contiguous region
of memory into which it can write, and a region from 
which it can read. (These regions would normally at 
least overlap, if not be the same, but it is not necessary.) 

MiSFIT requires that the size of each memory
region be a power of two; because of this, the high bits 
of each address in the memory region (the region tag) 
will be the same. To sandbox a memory reference, MiS- 
FIT simply sets the high bits of the reference to the 
region tag of its associated memory region. Any load or 
store that would have accessed memory outside its 
region is thus forced to fall somewhere inside the exten- 
sion’s memory region. Note that if the fault isolated tar- 
get address was already in the extension’s memory 
region, it does not change. The fault isolated address dif- 
fers from the original target address only if the original 
target address was outside the extension’s memory 
region (and therefore illegal). Examples of this transfor- 
mation are given in Figure 1. 
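
The masking step described above can be sketched in a few lines of C++ for 32-bit addresses. This is only an illustration of the arithmetic: the Region type and the function names are hypothetical, and MiSFIT itself applies the transformation to assembler code rather than calling a function.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of an extension's memory region.
struct Region {
    std::uint32_t base;  // region tag in the high bits, low bits zero
    std::uint32_t size;  // must be a power of two
};

// A region is usable for sandboxing only if its size is a power of two
// and its base is aligned to that size.
inline bool is_valid_region(Region r) {
    return r.size != 0 &&
           (r.size & (r.size - 1)) == 0 &&
           (r.base & (r.size - 1)) == 0;
}

// Clear the old region tag (high bits), then OR in the extension's tag.
inline std::uint32_t sandbox(std::uint32_t addr, Region r) {
    const std::uint32_t offset_mask = r.size - 1;
    return (addr & offset_mask) | r.base;
}
```

With a 64KB region tagged 0xabcd, an address already inside the region is returned unchanged, while an address outside it keeps its low sixteen bits and is redirected into the region.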

There is one more detail: in order to preclude the 
code from (unsafely) modifying itself, the writable
region should be chosen so that it does not overlap the
code space assigned to the extension.

    movl eax,0(edx)          ; do the store

is transformed into:

    andl $0xffff,edx         ; clear old region tag
    orl destmask,edx         ; set our region tag
    movl eax,0(edx)          ; do the store

    movl eax,12(ebx,ecx)     ; do the store

is transformed into:

    pushl edx                ; obtain scratch register
    leal 12(ebx,ecx),edx     ; load target address
    andl $0xffff,edx         ; clear old region tag
    orl destmask,edx         ; set our region tag
    movl eax,0(edx)          ; do the store
    popl edx                 ; restore scratch register

Figure 2: Sandboxing transformations for a store
instruction. In the first case the target is a simple
indirection through a register; in the second case it is a
complex indirection, so a scratch register is first made
available and the target is loaded into the scratch register
before sandboxing. In this example, the size of the assigned
memory region is 64KB (the argument to the andl is
0xffff). Note that all of the added instructions take one
cycle on the Pentium (assuming that the stack targets of the
push and pop are in the first level cache). Note: the general
format of x86 assembler instructions is instr src, dest.

MiSFIT modifies the loads and stores in the following
way. First, it inserts code to load the target address
into a register (if it is not already in a register). The high 
bits of the register are then cleared, and the region tag of 
the associated memory region is then OR’d into the reg- 
ister. The register is then used in place of the operand in 
the original instruction. 

Depending on whether the target address was 
already in a register, this technique adds either two or 
five instructions.⁵ If the original operand is an indirection
through a single register (with no constant offset)
only two instructions are needed, an AND to clear the 
high bits of the register and an OR to set the region tag. 
If the target address is not already in a register, MiSFIT 
inserts five instructions: MiSFIT obtains a scratch regis- 
ter (by pushing its current value on the stack), loads the 
effective target address into the scratch register, masks 
in the region tag as above, and restores the scratch regis- 
ter. 
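
The two cases can be illustrated with a small C++ sketch that rewrites one store into the sequences of Figure 2. It is not MiSFIT's implementation: simple_indirection and sandbox_store are hypothetical helpers, the mask 0xffff and the symbol destmask are taken from the figure, and the sketch assumes the scratch register differs from the source register and from every register named in the operand.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// True if `operand` is a plain register indirection such as "0(edx)" or
// "(edx)"; on success the register name is stored in `reg`.
inline bool simple_indirection(const std::string& operand, std::string& reg) {
    const std::size_t open = operand.find('(');
    const std::size_t close = operand.rfind(')');
    if (open == std::string::npos || close == std::string::npos || close < open)
        return false;
    const std::string disp = operand.substr(0, open);
    const std::string inner = operand.substr(open + 1, close - open - 1);
    if (!(disp.empty() || disp == "0")) return false;  // constant offset present
    if (inner.empty() || inner.find(',') != std::string::npos) return false;
    reg = inner;
    return true;
}

// Emits the sandboxed form of "movl src,operand".
// Assumption: `scratch` is not `src` and does not appear in `operand`.
inline std::vector<std::string> sandbox_store(const std::string& src,
                                              const std::string& operand,
                                              const std::string& scratch = "edx") {
    std::vector<std::string> out;
    std::string reg;
    if (simple_indirection(operand, reg)) {
        // Two added instructions: mask the register in place.
        out.push_back("andl $0xffff," + reg);     // clear old region tag
        out.push_back("orl destmask," + reg);     // set our region tag
        out.push_back("movl " + src + ",0(" + reg + ")");
    } else {
        // Five added instructions: compute the address in a scratch register.
        out.push_back("pushl " + scratch);        // obtain scratch register
        out.push_back("leal " + operand + "," + scratch);
        out.push_back("andl $0xffff," + scratch); // clear old region tag
        out.push_back("orl destmask," + scratch); // set our region tag
        out.push_back("movl " + src + ",0(" + scratch + ")");
        out.push_back("popl " + scratch);         // restore scratch register
    }
    return out;
}
```

A real filter would also have to pick a scratch register that does not collide with the instruction's operands and handle operands based on the stack pointer, since the push changes it; the sketch leaves both out.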

Examples of these transformations are shown in
Figure 2. Note that in the second case it would be possible
to save the scratch register push and pop if MiSFIT
were able to determine that there was a dead register⁶
available that could be used as a scratch register. The
MiSFIT performance impact is low enough that we have
not yet been tempted to perform this optimization.

5. Each instruction is executed in one cycle on the Pentium,
assuming all memory references hit in the L1 cache.


4.2 Virtual Function Calls 

When a virtual function call takes place MiSFIT must 
verify that the target address is one that the extension is 
permitted to call. If the extension were allowed to indi- 
rectly call to any address, it not only might obtain access 
to an unsafe function, it also might jump into the middle 
of an instruction or into data space, which would open 
all sorts of security and safety holes. 

MiSFIT restricts the extension by searching a table 
of valid function targets on each indirect call from an 
extension. The builder of the base system provides a file 
with the names of the functions that an extension may 
call; an auxiliary tool (provided with MiSFIT) deter- 
mines the start address of each of these functions at link 
time, and places the addresses into a table that is linked 
into the base system. 

Although there may be an arbitrarily large number 
of valid target addresses, the tool greatly limits search 
time by storing the valid addresses in a sparsely popu- 
lated open addressed hash table [Cormen90]. An open 
addressed hash table is implemented as an array; the 
hash value of the key gives the index of the array to 
check. When the tool adds items to the hash table, if the
key hashes to n and location n of the table is already in
use, it checks locations n+1, n+2, and so on, until it finds
a free slot for the value. When searching for a key in the
table, the search function hashes the key, yielding n, and
then checks location n of the table. If location n has a
value (but not the key) it checks locations n+1, n+2, and
so on, until it either finds the key (signifying success) or
finds an empty slot (signifying failure).
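
The insertion and search procedures just described can be sketched as a small C++ class. The names (CallTargetTable, kEmptySlot), the multiplicative hash function, the use of address zero as the empty marker, and the wrap-around at the end of the array are all assumptions made for illustration; the text does not specify them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: address 0 is never a valid call target, so it marks a free slot.
const std::uint32_t kEmptySlot = 0;

class CallTargetTable {
public:
    // `slots` should be well above the number of targets to keep density low.
    explicit CallTargetTable(std::size_t slots) : table_(slots, kEmptySlot) {}

    // Adds a valid target; returns false if the table is full.
    bool insert(std::uint32_t addr) {
        if (addr == kEmptySlot || table_.empty()) return false;
        std::size_t i = hash(addr);
        for (std::size_t probes = 0; probes < table_.size(); ++probes) {
            if (table_[i] == kEmptySlot || table_[i] == addr) {
                table_[i] = addr;
                return true;
            }
            i = (i + 1) % table_.size();  // try n+1, n+2, ... (wrapping)
        }
        return false;
    }

    // Returns true only if `addr` was inserted as a valid target.
    bool contains(std::uint32_t addr) const {
        if (addr == kEmptySlot || table_.empty()) return false;
        std::size_t i = hash(addr);
        for (std::size_t probes = 0; probes < table_.size(); ++probes) {
            if (table_[i] == addr) return true;         // found the key
            if (table_[i] == kEmptySlot) return false;  // hit an empty slot
            i = (i + 1) % table_.size();
        }
        return false;
    }

private:
    // Hypothetical hash: multiply, drop the low bits, reduce to a slot index.
    std::size_t hash(std::uint32_t addr) const {
        const std::uint32_t mixed = addr * 2654435761u;
        return static_cast<std::size_t>(mixed >> 12) % table_.size();
    }

    std::vector<std::uint32_t> table_;
};
```

Because probing only ever moves to the next array element, a lookup that misses on its first probe touches adjacent memory on the following ones.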

One subtle advantage of using an open addressed
hash table is that if the search function does not find the
key at location n, the next location checked (at
index n+1) is at an adjacent memory location and is therefore
likely to be in the cache. So, even if it fails on the first probe of
the table, the cost of subsequent probes is reduced.

By decreasing the density of the table, it is possible 
to reduce the number of probes needed nearly to unity 
(the theoretical minimum). With a table that has a 50% 
density (half the slots are empty) an average of fewer 
than 1.5 probes per indirect call are required. The overhead
of each probe is roughly six to ten cycles (assuming
everything hits in the L1 cache), adding, on average,
approximately ten to fifteen cycles to each indirect function
call.

6. A dead register is one that will not be read again before it is
written.

Indirect calls are common in C++ code, as virtual 
functions are implemented as indirect calls. When pro- 
tecting C++ code with MiSFIT the table of valid func- 
tion targets can become quite large, but the per- 
invocation cost remains low, because the number of 
probes into the table is independent of the size of the 
table, depending only on its density, which is under 
MiSFIT’s control. 
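
As a rough check on these figures, the standard textbook estimates for open addressing can be evaluated at a density of 50%. The choice of formulas is an assumption, since the text does not say which analysis was used: the first line is the uniform-hashing bound for a successful search, the second is Knuth's estimate for linear probing, and the third multiplies the 1.5-probe bound by the six to ten cycles per probe quoted above.

```latex
\begin{align*}
\text{uniform hashing:}\quad
  E[\text{probes}] &\le \frac{1}{\alpha}\ln\frac{1}{1-\alpha}
  = 2\ln 2 \approx 1.39 \quad (\alpha = 0.5) \\
\text{linear probing:}\quad
  E[\text{probes}] &\approx \frac{1}{2}\left(1 + \frac{1}{1-\alpha}\right)
  = 1.5 \quad (\alpha = 0.5) \\
\text{cost per indirect call} &\le 1.5 \times (6 \text{ to } 10 \text{ cycles})
  = 9 \text{ to } 15 \text{ cycles}
\end{align*}
```

Neither expression involves the number of entries in the table, only the density, which is why the per-call cost stays flat as the set of valid targets grows.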


4.3 Global Data, Virtual Function Tables
Because MiSFIT sandboxes global memory references, 
any data accessible to the extension must be placed in 
the memory region assigned to the extension. If there is 
global data that the extension should be able to access, 
the data should be placed in the memory region assigned 
to the extension. This applies not only to global program 
data, but other shared state, such as virtual function 
tables. 

The restriction on global program data is a problem 
if multiple extensions are to be granted access to the 
same datum. A work-around is for the application to 
provide functions to access the data; each extension will 
be given permission to call these accessor functions, and 
use them instead of directly reading and writing the 
data. 

This technique has an impact on performance that is 
difficult to quantify, as the cost is a function of the 
amount of data that is protected in this way, the fre- 
quency of access, and the type of interface the functions 
provide. In two of the three tests discussed in this paper 
this cost is not quantified; in the third, the cost is built
into the overall model, but not factored and measured
separately. 

Virtual function tables are a different matter. If 
MiSFIT is configured to use read protection, virtual
function tables need to be in a region of memory that is
readable by the extension. The solution we have chosen
for VINO is to store all virtual function tables in a contiguous
region of memory (by making a one-line change
to g++), and to map that region into each extension’s
read-only region.


4.4 Block Instructions 

The x86 instruction set includes memory-to-memory 
move and comparison instructions, movs and cmps, 
which take four or five clock cycles on the Pentium. The 
same goal can be accomplished by four one cycle 
instructions (assuming a scratch register is available). 
However, the memory to memory instructions have the 
advantage that they can be used to construct block move 
and compare sequences. The x86 rep instruction can be 
used as a prefix to the memory-to-memory instructions;
the rep prefix instructs the processor to repeat the
memory-to-memory instruction count times, where count
is the value in the %ecx register. The block move
instruction sequence has a lower per-move overhead
than a sequence or loop of individual memory-to-memory
move instructions, and can be generated by compilers
to perform structure copies and in-line expansions of
common C library functions such as strcmp() and
bcopy().

MiSFIT transforms the base addresses and repeat
count of arguments to the block instruction, sandboxing 
the compound instruction as a whole. Although this 
adds a high fixed overhead to the block instruction 
(roughly 26 cycles), there is no per-element cost. The 
alternative, transforming the block instruction into a 
loop and sandboxing the instructions in the loop, has a 
high per-element overhead; the break-even point for the 
two techniques is at three or four iterations. Block 
instructions are typically used for copying or moving 
more than four elements, so the fixed overhead imposed 
by MiSFIT’s technique is preferable. 
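
The break-even claim can be turned into a rough estimate of the per-element cost of the alternative. This is a back-of-the-envelope reading of the numbers above, not a measurement: F is the fixed overhead of about 26 cycles, c is the assumed per-element overhead of sandboxing inside a loop, and k is the break-even iteration count.

```latex
\begin{align*}
F &= k\,c, \qquad F \approx 26 \text{ cycles}, \quad k \in [3, 4] \\
c &= \frac{F}{k} \approx \frac{26}{4} \text{ to } \frac{26}{3}
  \approx 6.5 \text{ to } 8.7 \text{ cycles per element}
\end{align*}
```

For a copy of n elements the loop form would therefore cost roughly n times c extra cycles, which exceeds the fixed 26 cycles once n is larger than about four.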


4.5 Saved Registers and Return Addresses 
Protecting the contents of the stack is also problematic. 
The stack is used not only for local variables (which 
must be accessible to the user extension) but also saved 
registers and the function return address (which should 
not be accessible to the user extension). If the user 
extension could write to arbitrary locations on the stack, 
the return address of the function could be overwritten 
and set to an arbitrary value, circumventing the call pro- 
tection offered by MiSFIT. 

A second problem is that the process stack is nor- 
mally not in the same region of memory as the heap and 
global data; MiSFIT’s technique depends on all valid 
memory references falling within a single region of 
memory. In a multi-threaded environment (either a 
multi-threaded operating system kernel or multi- 
threaded end-user application) each thread of control is 
assigned its own stack. In environments where the 
extension can be run as a separate thread of control, 
MiSFIT can co-locate the stack assigned to the thread 
(i.e. assigned to the extension) with the memory region 
assigned to the extension. Then all valid memory refer- 
ences made by the extension will fall within a single 
region. 

In environments where there is a single thread of 
control, MiSFIT can provide the same type of protection 
by providing each extension with its own stack, located 
in its memory region. When the extension is invoked, 
the application switches to the stack associated with the 
extension. When the extension returns to the application,
the process switches back to the original stack.

To solve the problem of an extension overwriting a
return address on the stack, MiSFIT replaces each call
instruction within the extension with a call to a support 
routine that saves the return address in a separate stack 
outside the extension’s memory region and then jumps 
to the called function. MiSFIT then replaces each ret 
instruction with a jump to a second support routine that 
loads the saved return address and jumps to it. In this 
way, even if the extension misbehaves and overwrites 
the return address, the system returns to the correct loca- 
tion. To ensure that register values are preserved across 
the invocation of the extension, MiSFIT stores the con- 
tents of all callee-saved registers on entry to the exten- 
sion, and reloads these values when it returns. 
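
The behavior of the two support routines can be modeled with a short C++ sketch. It only simulates the bookkeeping: ReturnAddressStack, on_call, and on_return are hypothetical names, and the real routines are assembler-level code that keeps this stack in memory outside the extension's region.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Model of the separate return-address stack kept outside the
// extension's writable region.
class ReturnAddressStack {
public:
    // Called in place of each call instruction: remember where to return.
    void on_call(std::uint32_t return_addr) { saved_.push_back(return_addr); }

    // Called in place of each ret instruction: fetch the saved address.
    // Returns false if there is no matching call (unbalanced return).
    bool on_return(std::uint32_t& target) {
        if (saved_.empty()) return false;
        target = saved_.back();
        saved_.pop_back();
        return true;
    }

    std::size_t depth() const { return saved_.size(); }

private:
    std::vector<std::uint32_t> saved_;
};
```

Whatever value the extension writes over the return address on its own stack, the target produced by on_return is the one recorded at call time, so control returns to the correct location.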


4.6 Dynamic Linking 

MiSFIT modifies the operands of load, store, and call 
instructions that are computed at runtime. It does not 
modify operands that are labels, assuming that refer- 
ences to addresses within the module (i.e. local jumps, 
and loads and stores of module-level variables) are 
implicitly safe (generated by the compiler), and refer- 
ences to addresses outside the module will be checked 
by the dynamic linker when they are resolved. This 
implies that the dynamic linker is responsible for keep- 
ing track of which symbols may be linked to by an 
extension. Under some circumstances it may be the case 
that not all extensions will be given access to the same 
set of entrypoints. If this is the case, the dynamic linker 
is responsible for determining to which entrypoints a 
given extension should be given access. 

Relinquishing responsibility for protecting external 
symbols has a limitation. The assembler does not mark 
external symbols as being for read or write use; a single 
external reference is generated for all reads and writes. 
If there is no read protection, but there is write protec- 
tion, there is no way for the linker to discern which ref- 
erences are source (read) references and which are 
destination (write) references — in other words, which 
should be allowed, and which should be disallowed. 

To solve this problem, MiSFIT generates a table of 
addresses of instructions that write operands that are 
labels. The dynamic linker can use the information in
this table, together with the external reference table,
to differentiate between read references and write
references at link time.
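
A minimal sketch of how a linker might consult such a table follows. It assumes the configuration discussed above (write protection but no read protection), so reads are always allowed; the class name, the method name, and the representation of the table as a set of instruction addresses are illustrative assumptions, not part of MiSFIT.

```java
import java.util.Set;

// Hypothetical link-time check: writeTable holds the addresses of
// instructions that write operands that are labels.
final class LinkTimeCheck {
    static boolean allowed(int instructionAddress,
                           Set<Integer> writeTable,
                           boolean symbolWritable) {
        boolean isWrite = writeTable.contains(instructionAddress);
        // Reads are always permitted here; writes need a writable symbol.
        return !isWrite || symbolWritable;
    }
}
```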

4.7 An Alternative to Sandboxing on the x86 

On the x86, an alternative to sandboxing exists. The 
bound instruction checks that a value falls within a 
specified range; if it does not, a trap occurs. If this trap 
can be caught, the ill-behaved extension can be stopped 
before it does any damage. It appears that the bound
instruction was designed to be used for array bounds
checking, since it performs a signed (rather than
unsigned) comparison. This does not preclude using it
for SFI. MiSFIT might arrange for all parts of the region
of memory assigned to an extension to have the same sign
(i.e. not cross the border between location 0x7fffffff and
location 0x80000000), so the signed nature of the
comparison would not be a problem. The bound instruction
takes more cycles than the instructions needed to set the
high bits of a register (eight vs. two); however, instead
of neutering an illegal load or store, the bound
instruction would trap an illegal memory access.
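
The difference between the two approaches can be sketched as follows. The region layout (a 64 KB region identified by its upper 16 address bits) and all names are assumptions made for illustration; the real checks are a few x86 instructions, not Java methods.

```java
// Hypothetical region: upper 16 address bits equal REGION_TAG.
final class AddressChecks {
    static final int REGION_TAG = 0x12340000;
    static final int OFFSET_MASK = 0x0000FFFF;

    // Sandboxing: force the high bits, so an illegal address is silently
    // redirected ("neutered") to some location inside the region.
    static int sandbox(int address) {
        return (address & OFFSET_MASK) | REGION_TAG;
    }

    // bound-style check: a signed range comparison that traps on failure.
    static int boundCheck(int address) {
        int lower = REGION_TAG;
        int upper = REGION_TAG | OFFSET_MASK;
        if (address < lower || address > upper) {
            throw new IllegalStateException("address outside extension region");
        }
        return address;
    }
}
```

Because the assumed region lies entirely below 0x80000000, Java's signed int comparison behaves like the signed comparison discussed above.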

This paper includes results of running tests compar- 
ing the performance of MiSFIT using the bound 
instruction and the sandboxing technique described in 
Section 4.1. The results show that the sandboxing 
method has superior performance, which is not surpris- 
ing, considering the cost of the bound instruction in 
comparison with the cost of sandboxing. 


5 Runtime Support 

MiSFIT includes runtime support for linking extension 
code as new virtual functions to existing objects, setting 
up the state of an extension, and managing free store for 
the extension. 


5.1 Virtual Function Table Manipulation 

In the MiSFIT model, an extension is used to modify the 
behavior of a single object, by replacing a virtual func- 
tion of that object. MiSFIT accomplishes this by making 
a copy of the virtual function table for that object and 
writing a new value into the slot corresponding to the 
replaced function. 

The process by which an extension is called is 
somewhat baroque. MiSFIT cannot just replace the
address of the old function with the address of the newly 
loaded function in the virtual function table. As outlined 
above, when an extension is called its sandbox needs to 
be configured. 


5.2 Calling The Extension 

When an extension is installed, a small assembler stub 
function (similar to a closure) is created. This stub is 
responsible for configuring the sandbox and calling the 
extension. The stub is specific to the particular exten- 
sion, because it includes the addresses of the extension’s 
sandbox regions, as well as the address of the extension 
function itself. 

The stub sets up the sandbox for the extension. It 
first saves callee-saved registers (as MiSFIT does not 
trust the extension to do so). The stub sets up the global 
variables that hold the region tags for the read and write 
(source and destination) regions assigned to the
extension, and copies any arguments passed to the extension
onto the extension’s stack. It switches to the extension’s
stack, and jumps to the extension.

When the extension completes, it jumps to the
return stub (remember that the extension’s ret
instruction was replaced by this jump, as described in Section
4.5), which switches to the regular stack, loads the saved 
registers, and returns to the base system. 

The runtime support code also includes the function 
that implements safe indirect calls (as described in Sec- 
tion 4.2). MiSFIT replaces indirect calls with code that 
loads the target function address and calls the hash table 
lookup function. If the function address is not found in 
the hash table by the lookup function, the function calls 
an abort function, which is responsible for cleaning up 
after the extension. 
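
The checked indirect call can be sketched as a set-membership test. The names below (IndirectCallGuard, register, checkedTarget) are hypothetical, and a java.util.HashSet stands in for the hash table of valid targets; how MiSFIT actually populates and probes its table is not shown in this section.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the safe indirect call support function.
final class IndirectCallGuard {
    private final Set<Integer> validTargets = new HashSet<>();

    // Record an address the extension is allowed to call.
    void register(int functionAddress) {
        validTargets.add(functionAddress);
    }

    // Returns the target if it is registered; otherwise runs the abort
    // handler (which cleans up after the extension) and refuses the call.
    int checkedTarget(int functionAddress, Runnable abortHandler) {
        if (!validTargets.contains(functionAddress)) {
            abortHandler.run();
            throw new SecurityException("indirect call to unregistered address");
        }
        return functionAddress;
    }
}
```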


5.3 Extension Free Store Management 

As the code running in an extension cannot reach
outside its bounds, if it were to allocate storage using new
it would not be able to read from or write to that
storage. MiSFIT provides a small heap in the data area
assigned to the extension, and simple implementations
of the built-in new and delete functions. When MiSFIT 
is processing an extension, it replaces any calls to the 
built-in new and delete functions with calls to the MiS- 
FIT versions. 


6 MiSFIT Overhead 

This section compares the performance of unprotected 
code (written in C or C++) with the MiSFIT-protected 
versions. Times are reported as a percentage of the 
unprotected versions. Performance numbers for both 
write-call (where store and call instructions are pro- 
tected) and read-write-call (where load, store, and call 
instructions are protected) tests are included. As pointed 
out above, read protection is typically a requirement for 
security, not for correctness. 


6.1 Operating System Extensions

In previous work [Small96], we examined the suitability
of various extension technologies for constructing oper- 
ating system extensions. Three tests were developed and 
used, with each test representing a class of possible OS 
extensions. Following is a short description of each test; 
for more detail, the reader is directed to the earlier 
paper. 
• hotlist: choose which page to evict from a linked list
  of page descriptors.
• lld: simulate the operation of a logical disk layer
  [DeJon93].
• md5: compute the MD5 checksum [RFC1321] of
  1 MB of data.

The tests were run on a 120MHz Pentium with
64MB of EDO memory, running BSD/OS 2.1. Each test
and its data fit into main memory. Times are reported
relative to the unprotected version of the code. The 
results are found in Table 1. 

The write-call overhead for these tests is low, at 
most 10%. The overhead for read-write-call protection 
can be much higher, over 200%. 

In our earlier work we computed a break-even point 
for each operating system extension. If the cost of using
the extension is below the break-even point, the
extension will improve overall system performance; if it is
above this point, it will degrade system performance.
The three write-call protected tests fall below the
break-even point, as do the read-write-call versions of lld and
md5, but the read-write-call version of hotlist does not.


            MiSFIT            MiSFIT
            Write-Call        Read-Write-Call
  Test      Protected         Protected
            (MiSFIT/          (MiSFIT/
            unprotected)      unprotected)

  hotlist   1.00              3.2
  lld       1.07              1.4
  md5       1.09              1.7


Table 1: Relative overhead of MiSFIT-protected code to
unprotected code on operating system extension benchmarks.
The cost of isolating writes and indirect writes is low, under
10%, but the cost of protecting reads as well can be
prohibitively high.


The performance of the write-call protected hotlist 
is equivalent to the unprotected version. This is because 
there are very few protected write instructions executed 
during the test. Because the kernel of the test repeatedly 
scans a linked list of page descriptors, the number of 
read instructions executed is very high. This bias is 
reflected in the performance of the read-write-call pro- 
tected version of this test, where the overhead is more 
than 200%. 

The lld test has a noticeable but small write-call
overhead of 7%; read protection adds another 33%. This 
test is not as read-intensive as hotlist, so the added over- 
head of read protection is much lower. The md5 test has 
similar performance characteristics, with a sub-10% 
write-call overhead, and an additional 60% overhead for 
read protection. 


6.2 SPECInt92

This experiment shows the results of several SPECInt92
benchmarks processed by MiSFIT, using write-call and 
read-write-call protection. The performance of the MiS- 
FIT-protected code relative to native code is reported in 
Table 2. 

Although it is unlikely that anyone would want to
load a SPEC benchmark into a web browser or database
server, these results give a feeling for the overhead
imposed by MiSFIT on “typical” code. (To better
estimate the overhead imposed by MiSFIT, the tables only
include time spent at user level.)


            MiSFIT            MiSFIT
            Write-Call        Read-Write-Call
  Test      Protected         Protected
            (MiSFIT/          (MiSFIT/
            unprotected)      unprotected)

  compress  1.09              1.26
  espresso  hel               1.76
  eqntott   1.02              1.68
  li        1.17              1.61


Table 2: Overhead of protection on SPECInt benchmarks for
MiSFIT, relative to unprotected code. MiSFIT times are the
mean of ten runs. Standard deviations were less than 1%,
except for compress, where it was 2.6%.

The write-call MiSFIT overhead for the SPECInt92
code is comparable to that of MiSFIT on the operating 
system extension benchmarks, ranging from a factor of 
1.02 to a factor of 1.17. As is seen above, the overhead 
of read-write-call protection is higher than the overhead 
for write-call protection, on the order of 1.26 to 1.76. 
This overhead is large, but still substantially less than 
that of an interpreted language. 

For memory-intensive applications, such as data 
copies, a higher overhead should be expected. The over- 
head seen is, of course, a function of the ratio of pro- 
tected instructions to unprotected instructions. 


6.3 VINO Kernel Extensions 

MiSFIT is used to protect the VINO operating system 
kernel from misbehaved end-user extensions. We mea- 
sured the performance overhead of MiSFIT on four ker- 
nel extensions [Seltzer96], and include these results 
here. For these tests we used MiSFIT for read-write-call 
protection, and the overhead shown is in line with the 
overhead seen above. 

The Read-ahead extension specifies which disk 
block to read next, by returning a value found in its 
memory region. This code performs little computation, 
so the overhead imposed by protecting its loads and 
stores dominates its performance.

The Page Eviction extension is similar to the hotlist 
extension described in Section 6.1, but instead of 
searching a linked list it searches an array. Because there 
is less pointer chasing, the overhead imposed by MiS- 
FIT is lower. 

Each time the Scheduling extension is called it 
searches a list of 64 process IDs. Because the code that 
traverses the list is trusted code (it is part of the base
system, outside the extension itself, unlike in the case of
hotlist), the overhead of using MiSFIT is much lower (see
footnote 7).

The fourth extension, which performs a simple 
encryption of a data stream, is data intensive. It copies 
8KB of data from an input buffer to an output buffer, 
applying a trivial (XOR-style) encryption to the data. 

This extension was designed to be a worst-case test 
for MiSFIT, with little computation done between each 
data load and store. The MiSFIT version of the code 
takes slightly more than twice as long as the unprotected 
code. It is theoretically possible for MiSFIT-protected
code of this form to take as much as six times as long as
unprotected code (remember MiSFIT can add five
instructions for each load and store), but it is difficult, if not
impossible, to construct a real-world example where
every instruction is a load or store. One possible case is
when data is being copied directly from one buffer to
another (as is done in this example), but the overhead
seen here is 100%, not 500%. In the case of a straight
data copy (using the x86 rep; movs instruction pair),
MiSFIT uses a different technique for fault isolation
which has lower overhead (see Section 4.4).
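
The six-times bound is simple arithmetic, sketched below under a deliberately crude assumption that every instruction costs the same. The model and its names are illustrative only; real costs depend on the instruction mix and the processor.

```java
// Crude, hypothetical cost model: all instructions cost one unit.
final class OverheadModel {
    // memoryFraction: share of executed instructions that are protected
    //                 loads or stores (0.0 to 1.0).
    // addedPerAccess: instructions added per protected access (up to five
    //                 in the worst case described above).
    static double slowdown(double memoryFraction, int addedPerAccess) {
        return 1.0 + memoryFraction * addedPerAccess;
    }

    public static void main(String[] args) {
        // Every instruction a load or store: one original plus five added.
        System.out.println(slowdown(1.0, 5));
    }
}
```

Under this model the factor of six arises only when every instruction is a protected load or store; any other mix gives a smaller factor.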





                   MiSFIT
  Test             Read-Write-Call
                   Protected
                   (MiSFIT/unprotected)

  Read-ahead       2.5
  Page Eviction    1.2
  Scheduling       1.1
  Encryption       2.01


Table 3: VINO Kernel Extensions: MiSFIT was used to
apply read-write-call protection, which causes overhead in line
with the results seen above. Each test was run between 300
and 3000 times; the standard deviation of each result was less
than 2.5%.


6.4 Performance Summary 

With read-write-call protection, MiSFIT-protected code
can take from 1.4 to 3.2 times as long as unprotected 
code. Although this overhead may seem large, it should 
be compared to the overhead of an interpreted safe lan- 
guage, such as current Java implementations (which are 
20 to 50 times slower than compiled C code), or the dis- 
advantage of writing extensions in an unfamiliar, but 
safe, compiled language, such as Modula-3. 


7. Calling trusted code outside the extension is analogous to a
Java application calling native methods, which are
implemented in compiled C or C++.


7 What Is Missing 


As shown in Section 2, SFI is not a complete solution. 
The MiSFIT package does not include a safe runtime 
support library, which would be specific to the base sys- 
tem. This support library would be responsible for 
ensuring that extensions do not violate their resource 
limitations. 

Extensions do not have access to the global heap; a 
version of malloc (or new) is needed that allocates 
memory from a pool inside the extension’s writable 
memory region. 

MiSFIT does not include a dynamic linker.
Depending on its application, a dynamic linker may 
already be part of the system (e.g., NetBSD). The 
dynamic linker, or some code-signing tool, would be 
responsible for verifying that the loaded code had been 
processed by MiSFIT. 

One restriction that is not currently addressed, but 
should be, is the difficulty of passing arguments and 
returns by reference. When calling an extension, the
calling stub pushes arguments onto the extension’s 
stack, but these arguments are currently restricted to 
immediate values. If the base system wants to pass an 
argument by reference (via a pointer) there is currently 
no way to do so. Additionally, there is no way for an 
extension to pass back data other than as the return value 
of the function or by storing the results in its writable 
memory region for later retrieval by the base system. 

The solution to this limitation is the application of 
standard techniques for marshalling and unmarshalling 
arguments for remote procedure calls. By specifying the 
number and types of parameters to the extensions with 
an interface definition language, extension-specific stub 
functions could be generated that would copy arguments 
into the extension’s address space when it is called, and 
copy results back to the base system when it returns. 
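
The copy-in/copy-out idea can be sketched as follows. Everything here is an illustrative assumption: the extension's writable region is modeled as a byte array, the extension as a Consumer over that array, and the names (MarshallingStub, call) are hypothetical. No interface definition language is involved in this sketch.

```java
import java.util.Arrays;
import java.util.function.Consumer;

// Hypothetical stub: the extension only ever sees its own region.
final class MarshallingStub {
    static byte[] call(byte[] argument, int resultLength,
                       Consumer<byte[]> extension, int regionSize) {
        byte[] region = new byte[regionSize];                       // extension's writable memory
        System.arraycopy(argument, 0, region, 0, argument.length); // copy argument in
        extension.accept(region);                                  // run the extension
        return Arrays.copyOfRange(region, 0, resultLength);        // copy result out
    }

    // Example extension: XOR each byte of its region in place.
    static byte[] demo() {
        byte[] argument = {1, 2, 3};
        return call(argument, 3, region -> {
            for (int i = 0; i < region.length; i++) {
                region[i] ^= 0x5A;
            }
        }, 16);
    }
}
```

The caller's buffer is never handed to the extension, so a pointer argument costs one copy in each direction.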


8 Conclusions 

The overhead imposed by MiSFIT when it is used for
write and call protection is small. It allows applications 
and kernels to be protected from end-user extensions 
written in otherwise unsafe languages. Unlike other 
tools, it is freely available. As part of an end-to-end 
solution to the problem of constructing an extensible 
system, MiSFIT can provide safety at low cost.


9 Availability

MiSFIT is covered by a BSD-style license, and is
available for public use without fee. Contact the author
(chris@eecs.harvard.edu) to obtain a copy of the code.


Krakatoa: Decompilation in Java
(Does Bytecode Reveal Source?)


Todd A. Proebsting Scott A. Watterson 
The University of Arizona * 


Abstract 


This paper presents our technique for automati- 
cally decompiling Java bytecode into Java source. 
Our technique reconstructs source-level expres- 
sions from bytecode, and reconstructs readable, 
high-level control statements from primitive goto- 
like branches. Fewer than a dozen simple code- 
rewriting rules reconstruct the high-level state- 
ments. 


1 Introduction 


Decompilation transforms a low-level language into 
a high-level language. The Java Virtual Machine 
(JVM) specifies a low-level bytecode language for a 
stack-based machine [LY97]. This language defines 
203 operators, with most of the control flow
specified by simple explicit transfers and labels.
Compiling a Java class yields a class file that contains type
information and bytecode. The JVM requires a
significant amount of type information from the class
files for object linking. Furthermore, the bytecode 
must be verifiably well-behaved in order to ensure 
safe execution. Decompilation systems can exploit 
this type information and well-behaved property to 
recover Java source code from the class file. 

We present a technique for transforming low-level
Java bytecode into legal Java source code. Our
system, Krakatoa (footnote 1), performs type inference to issue
local variable declarations. The verifier does the
same type of type inference, and the techniques are
well known. Presently, we focus our research on
two subproblems: recovering source-level
expressions and synthesizing high-level control constructs
from goto-like primitives.


* Address: Department of Computer Science,
University of Arizona, Tucson, AZ 85721; Email:
saw}@cs.arizona.edu.

1. Krakatoa is a volcano located in the Sunda Strait
between Java and Sumatra. Its 1883 eruption threw five cubic
miles of debris into the air and was heard 2200 miles away
in Australia.



Krakatoa uses a stack-simulation technique to re- 
cover expressions and perform type inference. Ex- 
pression recovery creates source-level assignments 
and comparisons from primitive bytecode opera- 
tions. We extend Ramshaw’s goto-elimination al- 
gorithm to structure (and create source for) ar- 
bitrary reducible control flow graphs. This tech- 
nique produces source code with loops and multi- 
level break’s. Subsequent techniques recover more 
intuitive constructs (e.g., if statements) via appli- 
cation of simple code rewrite rules. 


Traditional decompilation systems use graph 
transformations to recover high-level control con- 
structs. These systems require the author of the 
decompiler to anticipate all high-level control
idioms. When faced with an unexpected language
idiom, these systems either abort, or produce gotos 
(illegal in Java). Krakatoa represents a different ap- 
proach. Krakatoa first produces legal Java source 
given legal Java bytecode with arbitrary reducible 
control flow, and then recovers intuitive high-level 
constructs from this source. 


Figure 1 gives the five steps of decompilation 
performed by Krakatoa. First, the expression 
builder reads bytecode, recovers expressions and
type information, and produces a control flow 
graph (CFG). Next, the sequencer orders the CFG 
nodes for Ramshaw’s goto-elimination technique. 
Ramshaw’s algorithm produces a convoluted—yet 
legal—Java abstract syntax tree (AST). Our sys- 
tem then transforms this AST into a less convo- 
luted AST using a set of simple rewrites. The final 
phase produces Java source by traversing the AST.


  Java Bytecodes
        |
  [Expression Builder]
        |
  Flow graph with expressions and conditional gotos
        |
  [Node Sequencer]
        |
  Augmented Flow graph
        |
  [Goto Eliminator (Ramshaw’s Algorithm)]
        |
  [Code Simplifier]
        |
  Restructured Java AST
        |
  [Final Java printer]
        |
  Java Source


Figure 1: Java Bytecode Decompilation System


2 Expression Recovery 


Java bytecodes bear a very close correspondence 
to Java source. As a result, recovering expres- 
sions from Java bytecode is often simple—much 
simpler than recovering expressions from machine 
language. Java class files include information that 
makes recovering high-level operations like field
references easy. The fact that the bytecode must be
well-behaved (i.e., verifiable) also simplifies
analysis. Figure 2 gives a sample program and its
abbreviated disassembly. Note the level of type
information in the disassembly produced by Sun’s javap
utility.

Symbolic execution of the bytecode creates the 
corresponding Java source expressions. It also cre- 
ates conditional and unconditional goto’s, which 
will be removed by subsequent decompilation steps. 
Symbolic execution simulates the Java Virtual Ma- 
chine’s evaluation stack with strings that represent 
the source-level expressions being computed. For
instance, iload_1, which loads the value of the
first local variable (of type int), could be
represented on the stack as “i1”. Similarly, if i1 and
i2 were the top two elements of the symbolic stack,
and the next bytecode were iadd (integer addition),
those elements would be popped off the stack and
replaced with “(i1+i2)”. The symbolic execution
of some expressions, like assignment, requires
emitting Java source.


class foo {
    int sam;

    int bar(int a, int b) {
        if (sam > a) {
            b = a*2;
        }
        return b;
    }
}

Compiled from foo.java
class foo extends java.lang.Object {
    int sam;
    int bar(int,int);
}

Method int bar(int,int)
   0 aload_0
   1 getfield #3 <Field foo.sam I>
   4 iload_1
   5 if_icmple 12
   8 iload_1
   9 iconst_2
  10 imul
  11 istore_2
  12 iload_2
  13 ireturn


Figure 2: Simple Method and Bytecode Disassembly
(via javap -c).


Our algorithm recovers expressions one basic 
block at a time. Some basic blocks (such as those 
produced by the conditional expression operation, 
A?B:C) do not begin with empty stacks, so some 
information must be propagated from
predecessors. Also, basic blocks that begin
exception-handling blocks (which are easily identified) begin
with the raised exception on the stack.


Figure 3 provides the step-by-step decompilation
of the bytecode in Figure 2. The initial aload_0
instruction pushes a Java reference onto the stack.
In virtual functions, the “0’th” local variable, a0,
always refers to this. The getfield instruction
references a named field, “sam”, of the current top
of stack. Therefore, the “this” is popped and
replaced with “this.sam”. iload_1 pushes “i1”
onto the stack. The if_icmple compares the top two
stack elements and branches to the appropriate
instruction if the lower is less than or equal to the
top element. Symbolically executing the if_icmple
requires popping the top two elements and
emitting the appropriate conditional branch.
Translating the remaining instructions is similar.

Most of the bytecode instructions are equally 
simple to symbolically execute. Unfortunately, a 
few require more information. Some of the stack 
manipulation routines (e.g., pop2, dup2, etc.) de- 
pend on byte offsets from the stack top. For in- 
stance, pop2 removes the top 8 bytes from the 
stack, whether those 8 bytes represent one 8-byte 
double value, or two 4-byte scalar values. To cor- 
rectly simulate these instructions the symbolic ex- 
ecution keeps track of the size (and type) of each 
stack element. 
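
The stack simulation can be sketched in a few lines of Java. This is not Krakatoa's implementation: it handles only the handful of opcodes in Figure 2, takes instructions as pre-split strings with the field name written out (an assumed input format), ignores operand sizes and types, and does not emit labels such as L12.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical, minimal symbolic executor for the method in Figure 2.
final class SymbolicStackDemo {
    static final String[] FIGURE_2 = {
        "aload_0", "getfield sam", "iload_1", "if_icmple 12",
        "iload_1", "iconst_2", "imul", "istore_2", "iload_2", "ireturn"
    };

    static List<String> decompile(String[] code) {
        Deque<String> stack = new ArrayDeque<>();   // strings standing for expressions
        List<String> source = new ArrayList<>();    // emitted source statements
        for (String insn : code) {
            String[] parts = insn.split(" ");
            switch (parts[0]) {
                case "aload_0":  stack.push("this"); break;
                case "iload_1":  stack.push("i1"); break;
                case "iload_2":  stack.push("i2"); break;
                case "iconst_2": stack.push("2"); break;
                case "getfield": stack.push(stack.pop() + "." + parts[1]); break;
                case "imul": {
                    String right = stack.pop();
                    String left = stack.pop();
                    stack.push("(" + left + "*" + right + ")");
                    break;
                }
                case "istore_2": source.add("i2 = " + stack.pop()); break;
                case "if_icmple": {
                    String right = stack.pop();
                    String left = stack.pop();
                    source.add("if (" + left + " <= " + right + ") goto L" + parts[1]);
                    break;
                }
                case "ireturn": source.add("return " + stack.pop()); break;
                default: throw new IllegalArgumentException("unsupported: " + insn);
            }
        }
        return source;
    }

    public static void main(String[] args) {
        decompile(FIGURE_2).forEach(System.out::println);
    }
}
```

Instructions that only move values push or pop strings; instructions with side effects or control transfers emit a source line.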


3 Instruction Ordering 


After recovering expressions, conditional and un- 
conditional goto’s (along with implicit fall through 
behavior) determine control flow. Java, however, 
has no goto statement, so its control flow must be 
expressed with structured statements. 

Ramshaw presented an algorithm for eliminating 
goto’s from Pascal programs while preserving the 
program’s structure [Ram88]. This algorithm
replaces each goto with a multilevel break to a
surrounding loop. The algorithm determines the
appropriate locations for these surrounding loops.
We trivially extended his algorithm to use multi- 
level continue’s. 

Ramshaw’s (extended) algorithm replaces each 
forward goto with a break and each backward 
goto with a continue. His algorithm inserts a loop 
that ends just before the target of each break state- 
ment. Likewise, it inserts a loop that starts just 
before the target of each continue. These loops 
ensure that each control-transfer statement jumps 
to the correct instruction. Each newly-inserted 
loop must also end with a break statement, so 
that control will fall out of the loop. Figure 4 
shows an example of this technique. Additional 
loops and break/continue’s create a structured
program with exactly the same control flow as the
goto-only program.
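
The result of this transformation is ordinary Java, so the structured form of the example in Figure 4 can be written and run directly. In the sketch below the statements are replaced by trace output and the two predicates are passed in as booleans; the class and method names are hypothetical.

```java
// Structured form of: stmt0; if expr1 goto L1; if expr2 goto L2;
//                     L1: stmt1; L2: stmt2
final class LabeledBreakDemo {
    static String run(boolean expr1, boolean expr2) {
        StringBuilder trace = new StringBuilder("stmt0 ");
        L2:
        for (;;) {
            L1:
            for (;;) {
                if (expr1) break L1;   // was: if expr1 goto L1
                if (expr2) break L2;   // was: if expr2 goto L2
                break L1;              // fall out of the inserted loop
            }
            trace.append("stmt1 ");
            break L2;                  // fall out of the inserted loop
        }
        trace.append("stmt2");
        return trace.toString();
    }

    public static void main(String[] args) {
        System.out.println(run(false, true));
    }
}
```

Neither loop ever iterates; each exists only to give a break statement a target that ends just before the original label.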

Ramshaw’s algorithm requires two inputs: the 
control flow graph, and an instruction ordering. His 
algorithm encodes this order into the flow graph 
using augmenting edges, such that every instruc- 
tion has an augmenting edge to the next instruc- 
tion in sequence. These augmenting edges occur 
between every pair of physically adjacent instruc- 
tions even if actual control flow between them is 
impossible. He proves that if this augmented graph 
is reducible, then a structurally equivalent [PKT73] 
program can be created without goto’s. How- 
ever, Ramshaw provides no algorithm for finding 
a reducible augmented flow graph from a given re- 
ducible flow graph. 

The control-flow graphs of Java programs are 
reducible. Therefore, the compiled bytecode will 
likely form a reducible control-flow graph. Unfor- 
tunately, simple optimizations like loop inversion 
create irreducible augmented flow graphs. The flow 
graph of the program in Figure 8 has this problem 
because the augmenting edge between the first two 
statements creates a “jump” into the body of the 
loop formed by the next seven statements. 

To utilize Ramshaw’s algorithm, we developed an 
algorithm that orders a reducible graph’s instruc- 
tions such that the resulting augmented graph is 
also reducible. 


3.1 Augmenting the Flow Graph 


Creating a reducible augmented flow graph re- 
quires that no augmenting edge enters a loop any- 
where other than at its header. Preventing this 
is simple—when ordering the instructions, make 
the header first and contiguously order the loop’s 
instructions. Because physical adjacency deter- 
mines augmenting edges, contiguously ordering the 
instructions guarantees that the only augmenting 
edge entering the loop from the outside will be en- 
tering at the top, which will not affect reducibility 
if it is the loop’s header.

A loop with no nested loops inside is easy to 
order—simply remove the back edges and topo- 
logically sort the remaining directed acyclic graph 
(DAG). Handling interior loops requires replacing 
them with a single placeholder node in the graph 
and separately ordering both the loop and the sur- 
rounding graph. After ordering both, re-insert the 
loop’s nodes at its placeholder. Re-ordering
instructions may change whether or not one
instruction falls through to another as it did in the original
ordering. Where implicit control flow has changed,
the algorithm must add new branches to restore
the original control flow. Whenever possible, the
topological sort attempts to maintain the original
fall-through behavior.


Bytecode                        Symbolic Stack       Emitted Source
aload_0                         "this"
getfield #3 <Field foo.sam I>   "this.sam"
iload_1                         "this.sam", "i1"
if_icmple 12                                         if (this.sam <= i1) goto L12
iload_1                         "i1"
iconst_2                        "i1", "2"
imul                            "(i1*2)"
istore_2                                             i2 = (i1*2)
                                                     L12:
iload_2                         "i2"
ireturn                                              return i2


Figure 3: Symbolic Execution of Bytecode


Before:

    stmt0
    if expr1 goto L1;
    if expr2 goto L2;
L1: stmt1
L2: stmt2

After:

    stmt0
L2: for (;;) {
L1:     for (;;) {
            if expr1 break L1;
            if expr2 break L2;
            break L1;
        } // L1
        stmt1
        break L2;
    } // L2
    stmt2


Figure 4: Ramshaw’s Goto Elimination: Before and After


This algorithm produces a reducible augmented 
graph. Because all loops are ordered separately, 
and laid out contiguously, the only augmenting 
edge entering from outside enters at the top. The 
topological sort of the loop (minus its backedges) 
guarantees that this top node is the loop header and 
that no internal edges cause irreducibility. Outside 
edges into the loop header cannot make a loop irre- 
ducible. Therefore, the resulting augmented graph 
is reducible. 
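
For a loop with no nested loops, the ordering step can be sketched as a topological sort that ignores edges back to the header. The code below is an illustration under several assumptions: nodes are numbered 0 to n-1 in their original order, edges are given as successor arrays restricted to the loop, the only back edges are those entering the header, and ties are broken by the smallest original index as a rough way of preserving fall-through. Placeholder nodes for nested loops and the insertion of new branches are not modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical ordering of one innermost loop.
final class LoopOrdering {
    static List<Integer> order(int n, int[][] edges, int header) {
        int[] indegree = new int[n];
        for (int[] successors : edges) {
            for (int s : successors) {
                if (s != header) indegree[s]++;   // back edges to the header are dropped
            }
        }
        // Prefer the node that came first in the original order.
        PriorityQueue<Integer> ready = new PriorityQueue<>();
        for (int v = 0; v < n; v++) {
            if (indegree[v] == 0) ready.add(v);
        }
        List<Integer> result = new ArrayList<>();
        while (!ready.isEmpty()) {
            int v = ready.poll();
            result.add(v);
            for (int s : edges[v]) {
                if (s != header && --indegree[s] == 0) ready.add(s);
            }
        }
        if (result.size() != n) {
            throw new IllegalArgumentException("not acyclic after removing back edges");
        }
        return result;
    }
}
```

For an inverted loop laid out as body (0), increment (1), test (2) with the test as header, the call order(3, new int[][]{{1}, {2}, {0}}, 2) places the header first and keeps the remaining nodes contiguous.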


Loops are not the only blocks of instructions 
which must be ordered contiguously. Exception 
handling regions must form contiguous sections of 
instructions. Class files specify which instructions 
are in which regions. Our algorithm orders those 
regions contiguously by treating them like loops. 


After applying this technique to create a total
ordering of the nodes (the augmenting path),
Krakatoa can apply Ramshaw’s technique to eliminate
goto’s.


4 Code Transformations 


4.1 Program Points 


After applying Ramshaw’s algorithm for eliminat- 
ing goto’s, Krakatoa has a complex, yet legal, Java 
AST (see Figure 9). Krakatoa then proceeds to 
recover more of the natural high-level constructs 
of the original program (e.g. if-then-else, etc.). 
Krakatoa uses a program point analysis to
summarize a program’s control-flow and to guide
recovering high-level constructs. A program point is a
syntactic location in a program. Every statement has a
program point both before and after it. These
program points have two properties: reachability and
equivalence class.

A program point is unreachable if and only if it is
preceded along all execution paths by an
unconditional jump statement (i.e. return, throw, break,
or continue). For instance, in Figure 5, program
point 3, Φ3, is unreachable, since it is preceded by a
return statement. The program point after the if
statement in that figure is reachable, however, since
one of the branches of that if statement
does not end with a jump statement.

Two program points are equivalent (denoted as
Φi ≈ Φj) if and only if future computation of the
program is the same from both points. For
instance, the program point before a break
statement is equivalent to the program point after the
loop it exits. As an example, in Figure 6, Φ1, Φ2, Φ4,
Φ5, Φ6, and Φ7 are equivalent, as are Φ3 and the
program point after the loop.

Both reachability and equivalence are simple
to compute via standard control-flow analyses
[ASU86].


4.2 AST Rewrite Rules 


Krakatoa performs a series of AST rewriting trans- 
formations to recover as many of the “natural” pro- 
gram constructs as it can (e.g. if-then-else, etc.). 
Krakatoa applies these rewriting rules repeatedly 
until no changes occur. We have found that the 
few rules below are sufficient to retrieve high-level 
constructs of the Java language, including if-then- 
else statements, and short-circuit evaluation of ex- 
pressions. Each rewriting rule reduces the size of 
the AST, thus ensuring termination. 

Table 1 summarizes the rules, which we describe 
below in greater detail. Many of these rules gen- 
eralize. Those that apply to for-loops often apply 
to other loops. Many rules have several symmetric 
cases. For example, the first rule in Table 1 removes
an empty else-branch from an if-then-else
statement—there is a symmetric rule for removing 
an empty then-branch by negating the predicate. 


4.3 if-then-else Rewriting Rules 


The first transformation shown in Table 1 changes 
an if-then-else statement into an if-then state- 
ment when the else branch is empty. This trans- 
formation is always legal. 

The second transformation creates an if-then-else
statement from an if-then statement by hoisting
the subsequent statement list into the else-part.
Our algorithm performs this transformation if and
only if no reachable program point in Stmtlist1 is
equivalent to the program point before Stmtlist2.
Essentially, this means that no statement in the
then-branch (Stmtlist1) can reach Stmtlist2
directly.
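
The side condition on the second rule can be illustrated on ordinary Java methods. The sketch below is an assumption-laden example, not output of Krakatoa; all method names are hypothetical. In minBefore the then-branch ends in a return, so hoisting the following statement into an else-part is safe. In clampBefore the then-branch falls through to the following statement, so the hoisted version (wrongHoist) computes something different.

```java
/** Hypothetical examples of when hoisting into an else-part is and is not legal. */
final class IfElseSketch {

    // Rule applies: the then-branch cannot reach "return b" directly.
    static int minBefore(int a, int b) {
        if (a < b) {
            return a;
        }
        return b;
    }

    // Result of the rewrite: same behavior, more natural structure.
    static int minAfter(int a, int b) {
        if (a < b) {
            return a;
        } else {
            return b;
        }
    }

    // Rule does not apply: the then-branch falls through to "r = a + 1".
    static int clampBefore(int a, int b) {
        int r = 0;
        if (a < b) {
            a = b;
        }
        r = a + 1;
        return r;
    }

    // What an unchecked hoist would produce: the then-path now skips "r = a + 1".
    static int wrongHoist(int a, int b) {
        int r = 0;
        if (a < b) {
            a = b;
        } else {
            r = a + 1;
        }
        return r;
    }

    public static void main(String[] args) {
        System.out.println(minBefore(3, 7) == minAfter(3, 7));
        System.out.println(clampBefore(1, 5) == wrongHoist(1, 5));
    }
}
```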


4.4 Loop Rewriting Rules 


The third rule in Table 1 removes useless continue 
statements. If the program point after a continue 
statement is equivalent to the program point before 
the continue statement, then that continue can 
be removed. 

The fourth rule creates a short-circuit test ex- 
pression within a for-loop by eliminating an inte- 
rior if statement. Doing so requires that the loop 
body begin with an if-then-else statement, and 
that the then branch of that statement consists 
of a single jump to a program point equivalent to 
breaking out of the loop. 

The fifth transformation provides an example of 
transforming loops into if statements. A loop is 
equivalent to an if if it can never repeat itself, and if 
all simple break statements can be safely removed 
during the transformation. A loop never repeats 
if its last program point is unreachable. break’s 
may be removed if the immediately following (un- 
reachable) program point is equivalent to the last 
program point in the loop (®, in Table 1). The 
transformation replaces the loop with an if state- 
ment, and deletes all of the break statements for 
that loop. 


4.5 Short Circuit Evaluation Rewriting Rules


The sixth rule shown in Table 1 recovers a short- 
circuit Or conditional. Short-circuit Or’s exist 
when two adjacent conditionals guard the same 
statement list and failure of either will cause a 
branch to equivalent locations. 

The last transformation in Table 1 recovers
short-circuit And expressions. This transformation is
applicable whenever a simple if statement represents
the entire body of another. 
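
Both short-circuit rewrites can be checked on small Java methods. The following is a sketch under stated assumptions: the method names are hypothetical, and the jump-based "before" form of the Or rule is written with labeled blocks and breaks (in the style of Figure 9) because Java has no goto.

```java
/** Hypothetical before/after pairs for the short-circuit And and Or rules. */
final class ShortCircuitSketch {

    // Before the And rule: one simple if is the entire body of another.
    static int nestedIf(int x, int y) {
        int r = 0;
        if (x > 5) {
            if (y > x) {
                r = 1;
            }
        }
        return r;
    }

    // After the And rule.
    static int andForm(int x, int y) {
        int r = 0;
        if (x > 5 && y > x) {
            r = 1;
        }
        return r;
    }

    // Before the Or rule: two adjacent tests guard the same statement list.
    static int adjacentTests(int x, int y) {
        int r = 0;
        guarded: {
            tests: {
                if (y > x) break tests;    // first test succeeds: run the guarded code
                if (y < 100) break tests;  // second test succeeds: same target
                break guarded;             // both tests failed: skip the guarded code
            }
            r = 1;
        }
        return r;
    }

    // After the Or rule.
    static int orForm(int x, int y) {
        int r = 0;
        if (y > x || y < 100) {
            r = 1;
        }
        return r;
    }

    public static void main(String[] args) {
        System.out.println(nestedIf(6, 9) == andForm(6, 9));
        System.out.println(adjacentTests(200, 150) == orForm(200, 150));
    }
}
```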


5 Status 


We have implemented a prototype Java decompiler, 
Krakatoa, in Java. We have run Krakatoa on a 
number of class files, including some to which we 
had no source code access. We examined the output 
of Krakatoa by hand, and Krakatoa appears to re- 
cover high-level constructs very well. Figures 7-10 
provide an example of the stages of decompilation.


Rule 2: Create if-then-else
  Before:     if (expr) { Stmtlist1 }
              Φ1: Stmtlist2
  After:      if (expr) { Stmtlist1 } else { Stmtlist2 }
  Condition:  Stmtlist1 contains no reachable program
              points equivalent to Φ1.

Rule 3: Delete Continues
  Before:     for (A ; B ; C) { Stmtlist Φ1 continue Φ2 }
  After:      for (A ; B ; C) { Stmtlist }
  Condition:  Φ1 ≡ Φ2

Rule 4:
  Before:     for (A ; B ; C) { if (expr) { jump } else { Stmtlist } }
  After:      for (A ; B and not expr ; C) { Stmtlist }
  Condition:  the jump targets a program point equivalent
              to breaking out of the loop.

Rule 5:
  Before:     for (stmt ; expr ; ) { Stmtlist Φ1 }
  After:      stmt
              if (expr) { Stmtlist }
  Condition:  Stmtlist contains no reachable program
              points equivalent to Φ1.

Rule 7: Create Short Circuit And's
  Before:     if (expr1) { if (expr2) { Stmtlist } }
  After:      if (expr1 && expr2) { Stmtlist }

Table 1: Canonical Code Transformation Rules


Φ1
if (a < b) {
    Φ2
    return a;
    Φ3 // (unreachable)
} else {
    Φ4
    a = b;
    Φ5
}

Figure 5: Reachable Points


Figure 7 shows the original source code of a sam- 
ple program. Figure 8 shows the results of expres- 
sion decompilation on the bytecode of this program. 
Figure 9 shows the results of applying Ramshaw’s 
algorithm to the decompiled expression graph. Fig- 
ure 10 shows the result of the grammar rewrit- 
ing rules applied to the output of Ramshaw’s al- 
gorithm. Obviously, using DeMorgan’s laws would 
simplify the boolean expressions. Future versions 
of Krakatoa will do so. 


For the JVM dup operators, which duplicate 
stack elements, Krakatoa simply creates a tempo- 
rary variable to hold the duplicated value. This 
yields unnatural, but easily readable, decompila- 
tions. A more difficult problem is our failure to 
recover the conditional-expression operator, “? :”. 
This operation presents two difficulties: it requires 
determining short-circuit operators during expres- 
sion recovery, and it requires that expression recov- 
ery handle non-empty stacks at basic block bound- 
aries. Fortunately, the short-circuit problem can be 
handled easily with four simple graph-writing rules 
given in [Cif93]. The non-empty stack problem is 
difficult because it requires combining expressions 
in our symbolic stack upon entering a basic block 
with multiple predecessors. Krakatoa again uses a 
temporary variable to hold the result of each branch 
of the conditional expression, and then assigns this 
temporary value to the conditional expression. We 
are currently investigating other solutions to this 
problem.


Φ1 // { Φ2, Φ4, Φ5, Φ6, Φ7 }
for (;;) {
    Φ2
    if (a < b) {
        Φ3 // { Φ8 }
        break;
        Φ4 // (unreachable)
    } else {
        Φ5
        continue;
        Φ6 // (unreachable)
    }
    Φ7 // (unreachable)
}
Φ8

Figure 6: Equivalent Points


Appendix B contains additional examples of 
Krakatoa’s output. 
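
The temporary-variable treatment of the dup operators described in this section can be pictured on a chained assignment. This is a hedged sketch: a Java compiler may translate "x = y = e" by evaluating e once, duplicating the value on the stack, and storing it twice, and the second method shows the shape of source that results when the duplicated value is instead held in a temporary. The name tmp1 is hypothetical, not a name Krakatoa is known to generate.

```java
/** Hypothetical illustration of replacing a duplicated stack value by a temporary. */
final class DupSketch {

    // Source form: the assigned value is needed twice.
    static int[] chained(int v) {
        int x;
        int y;
        x = y = v + 1;
        return new int[] { x, y };
    }

    // Decompiled-style form: a temporary stands in for the duplicated value.
    static int[] withTemporary(int v) {
        int x;
        int y;
        int tmp1 = v + 1;  // holds the value that dup would have copied
        y = tmp1;
        x = tmp1;
        return new int[] { x, y };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.equals(chained(4), withTemporary(4)));
    }
}
```

The result is less natural than the chained assignment but remains easy to read, which matches the trade-off stated above.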


6 Countermeasures 


Krakatoa is very effective at reproducing readable 
Java source from Java bytecode. This may be 
alarming to those who want to protect their source 
code from unwanted copying. Unfortunately, there 
are few countermeasures. 

One could introduce irreducible control-flow 
through bogus conditional jumps to foil Ramshaw’s 
algorithm. This, however, only stops the recre- 
ation of high-level constructs. Krakatoa could sim- 
ply produce source code in a Java-like language ex- 
tended with goto’s. 

One could introduce bizarre stack behavior to foil 
expression recovery. This is difficult, however, be- 
cause the behavior cannot be so bizarre as to yield 
unverifiable bytecode. It is possible, however, to 
create many bogus threads of control (i.e., threads 
that will never execute) that will confuse the ex- 
pression recovery mechanism in basic blocks that 
are entered with non-empty stacks. 

One code obfuscation technique that is modestly 
effective is to change the class file’s symbol table to 
contain bizarre names for fields and methods. So 
long as cooperating classes agree on these names, 
the class files will link and execute correctly [vV96, 
Sri96]. 


class foo {
  void foo(int x, int y) {
    while ((x + y < 10) && (x > 5)) {
      if ((y > x) || (y < 100)) {
        x = y;
      }
      else {
        x += 100;
      }
    }
  }
}

Figure 7: Original Source


class foo {
  void foo(int i1, int i2) {
      goto L4;
  L1: if (i2 > i1) goto L2;
      if (i2 >= 100) goto L3;
  L2: i1 = i2;
      goto L4;
  L3: i1 += 100;
  L4: if ((i1+i2)>=10) goto L5;
      if (i1 > 5) goto L1;
  L5: return;
  } // foo
} // foo

Figure 8: After Expression Decompilation


class foo {
  void foo(int i1, int i2) {
    lp3: for (;;) {
      if ((i1 + i2) >= 10) break lp3;
      if !((i1 > 5)) break lp3;
      lp2: for (;;) {
        lp1: for (;;) {
          if (i2 > i1) break lp1;
          if !((i2 >= 100)) break lp1;
          break lp2;
        } // lp1
        i1 = i2;
        continue lp3;
      } // lp2
      i1 += 100;
      continue lp3;
    } // lp3
    return;
  }
}

Figure 9: After Goto Elimination (Ramshaw’s Algorithm)


class foo {
  void foo(int i1, int i2) {
    lp3: for ( ; !((i1+i2)>=10) && ((i1>5)); ) {
      if (i2 > i1) || !((i2 >= 100)) {
        i1 = i2;
      }
    } // lp3
    return;
  }
}

Figure 10: After AST Transformation (Final Decompilation Results)


Another suggested solution is to use dedicated
hardware and encryption to protect class files
[Wil97].

Many traditional countermeasures to reverse- 
engineering will not work for Java bytecode. It is 
impossible to mix code and data. It is impossible to 
jump to the middle of instructions. It is impossible 
to generate bytecode and then jump to it. 


7 Related Work 


Ramshaw presented a technique for eliminating 
goto’s in Pascal programs by replacing them with 
multilevel break’s and surrounding loops [Ram88]. 
He made no attempt to recover high-level control 
constructs. All high-level control structures were 
provided by the original Pascal. 

Several decompilation systems have used a se- 
ries of graph transformations to recover high-level 
constructs [Lic85, Cif93]. These systems encounter 
difficulties in the presence of nested loops and
other arbitrary control flow. Multilevel break’s
cause considerable problems. Exception handling 
introduces another difficulty to such systems, as 
the control flow graph can be entered in several 
places. Krakatoa easily creates multi-level break’s 
and continue’s, and is able to eliminate virtually 
all of the unnecessary ones via successive applica- 
tion of the rewrite rules. 

“Mocha” (version 1 beta 1) [vV96] is a Java de- 
compiler written by Hanpeter van Vliet. Mocha 
uses graph transformations to recover high-level 
constructs. Mocha often aborts when it confronts 
tangled—yet structured—control flow (including 
multi-level break’s and continue’s). The system 
does issue type declarations, and uses debugging in- 
formation (when present) to recover local variable 
names. 

Other graph transformation systems used node- 
splitting to transform an unstructured graph to a 
structured graph [WO78, PKT73, Wil77]. Peter- 
son, Kasami, and Tokura present a proof that every 
flow graph can be transformed into an equivalent 
well-formed flow graph. Williams and Ossher use a 
similar technique, but they recognize five unstruc- 
tured sub-graphs, and replace those with equivalent 
structured graphs. Node-splitting preserves the ex- 
ecution sequence of a program, but not the struc- 
ture. We do not consider this reasonable for de- 
compilation. 

Baker presents a technique for producing programs
from flow graphs [Bak77]. Baker generates
summary control flow information to guide her
graph transformations. Our goal is similar, since
the output of the decompiler should be as readable
as possible. Her technique structures old FORTRAN
programs for readability. As a result, her
technique may leave some goto’s in the resulting
programs, which are not allowed in Java.

Other techniques for eliminating goto’s have 
been proposed [EH94, Amm92, AKPW83, AM75]. 
These techniques may change the structure of the 
program, and may add condition variables, or cre- 
ate subroutines. 


8 Conclusion 


In this paper, we present a technique for decom- 
piling Java bytecode into Java source. Our decom- 
piler, Krakatoa, produces syntactically legal Java 
source from legal, reducible Java bytecode. We fo- 
cus on two subproblems of decompilation: recov- 
ery of expressions from Java’s stack-based byte- 
code, and recovery of high-level control-flow con- 
structs. We present our stack simulation method 
for recovering expressions. We present an extension 
of Ramshaw’s goto elimination technique that can 
be applied to any reducible control-flow graph. 
We also present a small, yet powerful, set of code 
rewriting rules for recovering the natural high-level 
control-flow constructs of the Java source language. 
These rewrite rules enable Krakatoa to successfully
decompile many class files on which graph
transformation systems fail. If Krakatoa is presented with
a high-level language idiom that it does not rec- 
ognize, it may leave unnecessary breaks or con- 
tinues in the code. It will still produce legal Java, 
however. If a system relies on a graph transforma- 
tion system to produce high-level constructs, it will 
fail when presented with an unexpected construct. 
Our techniques, combined with the abundant
type information available in class files, make
decompilation of Java bytecode quite effective.


A Additional Rewriting Rules


We anticipate using a few other tree rewriting rules
that might improve readability of our code. The
anticipated rules build more natural for-loops.
Table 2 presents additional code transformation rules
that could be applied by Krakatoa. We expect
to add these rules as we re-implement Krakatoa in
Java.


Rule: Include Init
  Before:     I
              for ( ; expr ; update ) { Stmtlist }
  After:      for (I ; expr ; update ) { Stmtlist }
  Condition:  I is a simple statement.

Rule: Include Update
  Before:     for (init ; expr ; ) { Stmtlist U Φ1 }
  After:      for (init ; expr ; U ) { Stmtlist }
  Condition:  Stmtlist contains no reachable program
              points equivalent to Φ1.
              U is a simple statement.

Table 2: Additional Code Transformation Rules


B Sample Decompiler Output


Original Source                         | Output from Krakatoa
----------------------------------------+------------------------------------------
import java.io.PrintStream;             | import java.io.PrintStream;
import java.util.Vector;                | import java.util.Vector;
                                        |
public class Set                        | public class Set
  implements Cloneable {                |   extends java.lang.Object
                                        |   implements java.lang.Cloneable {
                                        |
  // class variables                    |
  static boolean echo_ops;              |   static boolean echo_ops;
                                        |
  // instance variables                 |
  protected Vector members;             |   protected java.util.Vector members;
                                        |
  // functions are defined here....     |   // functions are defined here...

Table 3: Class definition output from Krakatoa


We’ve included a representative sampling of
Krakatoa’s output on a classfile that implements
sets in Java. The original Java source is on the left
and Krakatoa’s output is on the right. Table 3
provides original source class definitions as well as the
corresponding Krakatoa output. Table 4 provides 
original source of several small functions together 
with Krakatoa output for those functions. Table 5 
shows a larger function in original source as well as 
Krakatoa output for that function.


Original Source                         | Output from Krakatoa
----------------------------------------+------------------------------------------
public boolean isMember(Object o) {     | public boolean isMember(
                                        |     java.lang.Object local1) {
  return (members.contains(o));         |   return this.members.contains(local1);
} // isMember                           | } // isMember
                                        |
public void addMember(Object o) {       | public void addMember(
                                        |     java.lang.Object local1) {
  if (!(isMember(o))) {                 |   if !( (this.isMember(local1) != 0) ) {
    members.addElement(o);              |     this.members.addElement(local1)
  } // then                             |   } // then
                                        |   return;
} // addMember                          | } // addMember
                                        |
public void removeMember(Object o) {    | public void removeMember(
                                        |     java.lang.Object local1) {
  members.removeElement(o);             |   this.members.removeElement(local1)
                                        |   return;
} // removeMember                       | } // removeMember
                                        |
public int size() {                     | public int size() {
  return members.size();                |   return this.members.size();
} // size                               | } // size
                                        |
boolean equals(Set s) {                 | boolean equals(Set local1) {
  Set d1, d2;                           |   Set local2;
                                        |   Set local3;
  d1 = difference(s);                   |   local2 = this.difference(local1);
  d2 = s.difference(this);              |   local3 = local1.difference(this);
  return ((d1.size() == 0) &&           |   if !(((local2.size() != 0) ||
          (d2.size() == 0));            |         !((local3.size() == 0)))) {
                                        |     return 1;
                                        |   } // then
                                        |   else {
                                        |     return 0;
                                        |   } // else
} // equals                             | } // equals

Table 4: Member Functions: Original Source and Krakatoa output


Original Source                           | Output from Krakatoa
------------------------------------------+----------------------------------------------
// This returns a NEW set, with all of    |
// the elements from this set and         |
// Set s.                                 |
public Set union(Set s) {                 | public Set union(Set local1) {
  Set out;                                |   Set local2;
  int size;                               |   int local3;
  int i;                                  |   int local4;
  Object obj;                             |   java.lang.Object local5;
  if (echo_ops) {                         |   if !( (Set.echo_ops == 0) ) {
    System.out.println("unioning");       |     java.lang.System.out.println("unioning");
  }                                       |   } // then
  out = new Set();                        |   local2 = new Set();
  out.members = (Vector) members.clone(); |   local2.members = ((java.util.Vector)
                                          |       this.members.clone());
  size = s.size();                        |   local3 = local1.size();
                                          |   local4 = 0;
                                          |   loop3 :
  for (i = 0; i < size; i++) {            |   for ( ; (local4 < local3) ; ) {
    obj = s.members.elementAt(i);         |     local5 =
                                          |         local1.members.elementAt(local4);
    if (!(out.isMember(obj))) {           |     if !((local2.isMember(local5) != 0)) {
      out.addMember(obj);                 |       local2.addMember(local5)
    } // then                             |     } // then
                                          |     local4 += 1;
  } // for                                |   } // loop3
  return out;                             |   return local2;
} // union                                | } // union

Table 5: Member functions: Original Source and Krakatoa output


Nataraj Nagaratnam, Dept. of ECE, Syracuse University
Steven B. Byrne, JavaSoft, Inc., Sun Microsystems
email: { nataraj@cat.syr.edu, sbb@eng.sun.com }


Abstract 


The rapid increase in the Internet’s connectivity
has led to a proportional increase in the development
of Web-based applications. Usage of downloadable
content has proved effective in a number
of emerging applications including electronic commerce,
software components on demand, and collaborative
systems. In all these cases, Internet user
agents (like browsers and tuners) are widely used by the
clients to utilize and execute such downloadable content.
With this new technology of using downloadable
content comes the problem of the downloaded
content obtaining unauthorized access to the client’s
resources. In effect, granting a hostile remote principal
the requested access to the client’s resources may
lead to undesirable consequences. Hence it is important
for browsers to provide a framework such
that the user can fine-tune his system according
to his trust relationship with the content authors.
Currently available systems either do not allow the
downloaded content to access any of the local resources
or allow all content to have the same
privileges. In this paper, we present the design and
implementation of a model that provides resource access
control of a finer granularity for a user agent.
Using our model, the client will be able to selectively
grant access to resources based on a trust relationship
with the principal who has certified the authenticity
of the contents.


1 Introduction 


The ever-expanding nature of the Internet and 
the World Wide Web poses new problems such as 
scalability, standard naming scheme, and security. 
Nowadays it is becoming increasingly common to 
download some active content over the untrusted 
Internet and execute it on a client machine. This 
downloadable content can be Java [1] applets, Cas- 
tanet [3] channel’s contents or component objects 
like JavaBeans [2], and other executables. With the
wide acceptance of object-oriented technology in every
aspect of engineering, it is also common to envision
all such content on the Web to be objects,
accepting messages and providing the necessary services.
Designing a scheme to protect client machines
from hostile applets and components has become a 
necessity. Such a scheme should also provide the 
user the ability to selectively allow trusted contents 
to be downloaded and executed. 

Protecting the client machine from hostile applets
can be considered equivalent to providing
controlled access to a (client) system’s resources.
Devising such a scheme calls for defining whom the
client trusts, identifying the source of the downloadable
content, and verifying that the principal certifying
the content is who it claims to be.
This scheme should be flexible enough for
the user to customize it to his security needs. The
flexibility is now a requirement given the classifica- 
tions of the network as Internet, corporate Intranet 
and Extranet (a domain consisting of an Intranet 
and multiple trusted client sub-domains). As more 
Intranet applications are developed it is common to 
assume that all such applications and downloadable 
content originating within the Intranet domain can 
be equally trusted. In this paper, we present the de- 
sign and implementation of such a scheme for usage 
in user agents like browsers such that the restriction 
of access to resources in the name of high-security 
does not prohibit the users from using downloadable 
contents such as applets. 


1.1 Motivation 


The Internet has proved to be an effective data 
distribution medium, especially for software. The 
concept of downloadable content, where the soft- 
ware component (or the software itself) can be down- 
loaded on-demand from the provider’s (server) ma- 
chine and executed on client machine, adds a flavor 
to this medium. From the point of view of soft- 
ware distributors, a new version of the software can 
just be installed on a server from where a client 
can download it or, in the case of Castanet [3], the 
tuner will automatically download updates in chan- 
nels from transmitters. At the same time, a client 
can be assured of obtaining the latest version. The
important aspect that affects such a developer-client
relationship over an open medium like the Internet
is the varying trust relationship between any such
pair. One way to protect a client’s resources is to prevent
any (and all) downloadable content from accessing
any of the client’s resources. This is exactly
the default policy enforced by the Java runtime to
protect the client machine from being attacked by
applets, the so-called “sandbox security model.” The
user agents (like browsers, tuners, etc.) in charge of
downloading the content over the net and executing
it on the client machine widely adopt this default
policy and hence restrict any applet from accessing
the client’s resources. Though this prevents hostile
applets from attacking a client machine, its
inflexibility prevents the client from granting access
to trusted applets. Hence, there is a
high demand for a flexible mechanism for user agents
(browsers) to serve the entire spectrum of trust
relationships, varying from completely trusted Intranet
content to highly untrusted Internet content. Such
a demand has been the motivation behind the modeling
of the flexible system described in this paper.


1.2 Infrastructure 


Our model is general enough to be applied to 
any environment of user agents and downloadable 
contents. Our implementation is tailored to the Java 
environment, as it is becoming the de facto standard 
for deploying Web-based applications. The basic
infrastructure over which our system is built is the
security framework of JavaSoft’s JDK 1.1 release.
We have taken advantage of digitally signed applets
(which establish an identity on which to base our trust),
the public-key cryptography mechanism
for the exchange of such content, the ACL (Access
Control List) framework, which associates a list of trusted
identities with any object the client is trying to
protect, and Java’s sandbox security model.

The rest of the paper is organized as follows. 
Section 2 describes related work. An overview of the 
Java’s sandbox security model is provided in Section 
3. The components of our model are defined in
Section 4. The specification details (identities,
groups) required in our model are covered in
Section 5. Our access control model is described
in Section 6, followed in Section 7 by our trust
policy, on which we base access decisions, and the way
in which such a policy is specified. Section 8
concludes this paper.


2 Related Work 


Netscape Navigator 3.01 prohibits any applet
downloaded over a network from accessing local files.
Only those applets that reside on the client machine
and are accessible through the CLASSPATH (i.e.
the content server is the client itself) can access
local files. Applets loaded over a network can
establish network connections only to the site from
which they were downloaded. Network connections
to any other site are prohibited. This inflexible
mechanism does not provide a way for users to
fine-tune their browser to allow trusted applets to
access local resources.

The HotJava 1.0 prebeta1 web browser provides
a little flexibility to users in controlling access to
local files. HotJava has encapsulated many parameters
as properties (<name, value> pairs) that can
be configured by the user. These properties take effect
when the browser is first invoked, and changes
to certain properties are absorbed dynamically.
Among those, the properties of interest are acl.read
and acl.write. The value of each of these properties is a
list of file (or directory) names. Specifying file (or
directory) names indicates to the system that any
applet run by the browser can read the files (or
directories) listed in acl.read and write to those
listed in acl.write. This means that either
all applets can read or write a given file or directory,
or none of them can.
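
The all-or-nothing character of such a property-based list can be sketched as follows. This is an illustration under stated assumptions, not HotJava's code: only the property names acl.read and acl.write come from the description above, while the class name, the ':' separator, and the prefix-matching rule for directories are hypothetical choices made for the example.

```java
import java.util.Properties;

/** Hypothetical check against a property that lists readable or writable paths. */
final class AclPropertySketch {

    /**
     * Returns true if path equals, or lies under, one of the entries of the
     * named property. Note that the applet's identity is not a parameter:
     * the answer is the same for every applet.
     */
    static boolean allowed(Properties props, String property, String path) {
        String value = props.getProperty(property, "");
        for (String entry : value.split(":")) {   // ':' separator is an assumption
            if (entry.isEmpty()) continue;
            String dir = entry.endsWith("/") ? entry : entry + "/";
            if (path.equals(entry) || path.startsWith(dir)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("acl.read", "/tmp:/home/user/public");
        System.out.println(allowed(props, "acl.read", "/tmp/report.txt"));
        System.out.println(allowed(props, "acl.write", "/tmp/report.txt"));
    }
}
```

Since the check has no notion of who wrote or signed the applet, a user cannot grant a file to one trusted applet without granting it to all of them.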

Safe Tcl [4] consists of two interpreters: a trusted
interpreter and an untrusted interpreter. The trusted
interpreter provides access to the client machine’s
resources, whereas access is prohibited in the
untrusted interpreter. The idea behind Safe Tcl is
to run trusted code in the trusted interpreter and
untrusted code in the other. It lacks authentication,
so all content must be assumed to have been
downloaded from untrusted sources.

The Telescript engine [5] uses credentials and
permits for access control. The credentials
establish the identity of the principal responsible for
the creation of the downloadable content. A permit
is like a capability that grants access rights to
other principals’ resources (including those of the
downloading client), although the client can deny a
right granted by the permit. Also, permits do not
have the scope of resource restrictions that we provide.

Cryptolopes (cryptographic envelopes) [6] provide
a mechanism for protecting the content from
hostile hosts. The client negotiates with the server
to access the content. Cryptolopes help provide security
to the content, whereas our system protects the client
from the content.

Abadi et al. [7] present a calculus for access
control from a logical perspective. They provide
a logical language for access control lists and theories
for making decisions on granting access requests.
They deal with roles by treating a role as a
composite principal that acts “as” the role (usually
with reduced rights). Our system does not
deal with roles at this point. We plan to work
on these extensions in the future.

Jaeger et al. [10] describe an architecture for
access control of downloaded content. Their architecture
allows access to resources by downloaded
content in a controlled manner. They map a remote
principal to a principal group and determine the access
rights. The four categories of principals they
consider are downloading principals, remote principals,
application developers, and system administrators.
Individual principals are aggregated into a
principal group if they have the same rights. Such
group rights are used to determine the rights of each
individual principal. In our design, we define access
to a resource using an ACL. This ACL is a
set of <principal, permissions> pairs. Thus, in
our method, the same ACL can be used to define access
rights for other resources. This increases the
flexibility and reusability of ACL definitions.
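
An ACL of this shape, and its reuse across resources, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation described in this paper and not the JDK's ACL framework: the class names, the Permission values, and the use of plain strings for principals and resources are all hypothetical.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of resources guarded by reusable ACLs. */
final class AclSketch {

    enum Permission { READ, WRITE, CONNECT }

    /** An ACL: a set of <principal, permissions> pairs, keyed by principal name. */
    static final class Acl {
        private final Map<String, Set<Permission>> entries = new HashMap<>();

        Acl grant(String principal, Permission first, Permission... rest) {
            entries.computeIfAbsent(principal, k -> EnumSet.noneOf(Permission.class))
                   .addAll(EnumSet.of(first, rest));
            return this;
        }

        boolean permits(String principal, Permission p) {
            Set<Permission> granted = entries.get(principal);
            return granted != null && granted.contains(p);
        }
    }

    /** Each protected resource maps to an ACL; one ACL may guard several resources. */
    private final Map<String, Acl> protectedBy = new HashMap<>();

    void protect(String resource, Acl acl) {
        protectedBy.put(resource, acl);
    }

    /** A resource with no ACL is treated as inaccessible (an assumption of this sketch). */
    boolean check(String principal, String resource, Permission p) {
        Acl acl = protectedBy.get(resource);
        return acl != null && acl.permits(principal, p);
    }

    public static void main(String[] args) {
        Acl trusted = new Acl().grant("alice", Permission.READ, Permission.WRITE);
        AclSketch model = new AclSketch();
        model.protect("/home/user/notes", trusted);
        model.protect("/home/user/mail", trusted);   // same ACL reused
        System.out.println(model.check("alice", "/home/user/mail", Permission.READ));
    }
}
```

Attaching the same Acl object to two resources is what makes a definition reusable: changing the trusted principals in one place changes the access decision for every resource the ACL guards.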


3 Java Security Model 


The implementation of our access control
model is for a Java-enabled user agent like HotJava.
Understanding the underlying security model
is necessary to successfully add advanced security
features. As we are concerned with the security
of a client system’s resources against downloaded
content (applets), we describe Java’s sandbox
security model, on which our implementation is
based.


3.1 Security Reference Model Specifications 


The Java Security Reference Model [13] defines 
an applet to be an executable Java program that 
is downloaded from the server. Also, applet load- 
ing and security is under the control of the appli- 
cation. Hence, defining our security policy for the 
browser is needed to provide necessary security en- 
hancements as far as downloaded applets are con- 
cerned. The model defines a set of security interac- 
tions between Java components namely applet, ap- 
plication, Java virtual machine (JVM), client-server 
platforms and the server itself. Among those
interactions, the Applet Access Device Attempt (AADA)
is important to our model. According to AADA, an
applet may attempt to call a method within the
application (browser), such as an access to the local
file system or the display. The application’s Security
Manager policy mediates the requested access. The
invariant in the AADA is that the application
always calls the Security Manager object to see if the
requested access is permitted. This model (the sandbox
model), in which access to resources goes through
a security manager object of the application, helps
to run untrusted code in a trusted environment and
still ensure that the applet cannot damage the local
machine.


3.2 Sandbox Security Model 


Users can import and run applets from the
Web or an intranet without damaging the client
machine. Such an applet’s actions are restricted to
the “sandbox”, which is an area dedicated by the
browser to that applet. The applet cannot access
any resources beyond the sandbox. This helps users
to run any (even untrusted) code and still ensure
protection of their resources from attacks. The scope
of such a sandbox is left to be defined by the browser.
Our work presents a model to expand this sandbox
to the extent the user desires, i.e., the user should be
able to define the scope of this sandbox depending upon
the remote principals certifying downloadable contents.

According to the sandbox model, a security
manager object serves as an access-approving
authority within the application. Any attempt to
access resources goes through the security manager,
which in turn grants or denies access. The security
manager is an object whose class is a subclass of the
class SecurityManager. When access is denied, the
security manager throws a security exception.
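
The mediation pattern described here can be sketched without depending on any particular JDK version. The class below is a hypothetical stand-in, not the implementation described in this paper: it does not extend java.lang.SecurityManager, although its checkRead method borrows the name of the corresponding check on that class, and it signals denial with the standard java.lang.SecurityException. The readFile method and the set of readable paths are assumptions made for the example.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for an application's security manager. */
final class SandboxManagerSketch {

    private final Set<String> readable = new HashSet<>();

    void allowRead(String path) {
        readable.add(path);
    }

    /** Returns normally if the read is permitted; otherwise throws SecurityException. */
    void checkRead(String path) {
        if (!readable.contains(path)) {
            throw new SecurityException("read denied: " + path);
        }
    }

    /** An application method: every access is mediated by the check before any work is done. */
    String readFile(String path) {
        checkRead(path);
        return "contents of " + path;   // placeholder for the real I/O
    }

    public static void main(String[] args) {
        SandboxManagerSketch manager = new SandboxManagerSketch();
        manager.allowRead("/tmp/ok.txt");
        System.out.println(manager.readFile("/tmp/ok.txt"));
        try {
            manager.readFile("/etc/passwd");
        } catch (SecurityException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The essential property is the invariant, not the data structure: the application never touches the resource before the check has returned normally.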

Currently, the existing browsers don’t have the 
flexibility to selectively allow access to selected ap- 
plets. Our work fills this gap by defining a model 
to specify trust, resources that can be accessed on 
a per-applet basis, and necessary extension of the 
SecurityManager class to achieve the desired flexi- 
bility. 


3.3 Establishing Trust


A mechanism to authenticate applets is neces- 
sary to define trust based on where the applet comes 
from. Digital signatures [11, 12] based on public-key 
cryptosystems come to the rescue. If an applet au- 
thor can sign his applet, then the client can verify 
his signature and take necessary action: either deny 
or allow resource access requests. The JDK provides
the necessary framework for signing class files. To sign
an applet, the author can bundle all Java code (class
files) into a single Java archive file called a JAR file.
Based on his private key and the contents of the JAR
file, the author generates a digital signature block.
On the client side, the security manager can resolve
authentication issues by using the digital signature
mechanism. Once the code is authenticated, the
security manager can make the right decision based on
the user’s access control specification.

Java provides the sandbox security model
along with the mechanism for authentication using
digital signatures. Our design is based on these
available facilities in the Java security framework.
In the following sections we describe how
we use this framework to help the user specify trust
and resource access control, and how these specifications
are interpreted by our security manager to control
access to the resources effectively.


4 Model Components 


Protecting client machines from malicious 
downloadable content is the objective of our system. 
Before we model a system that would achieve this, 
we need to define what we are trying to protect (re- 
sources) and from whom (principals certifying the 
contents). In this section we will define the granu- 
larity of such resources and varied categories of prin- 
cipals. 


4.1 Resources 


The resources on the client machine that need
to be protected include files and directories on the
local disk, network connections, CPU usage, memory
usage, and access to the display. The set of resources
can also be extended to non-physical components such
as remote objects and components like Java beans. In
our implementation, we illustrate the protection of
files and the restriction of network connections from
Java applets. Effective application of remote objects
[9, 8] may involve method invocations by downloadable
content on objects residing in a client machine.
In such cases, the user might be interested in
protecting the object from being invoked or accessed by
other hostile objects/applets.


4.2 Principals 


Whenever the issue of access control is raised,
so is the question of whom to trust. Implicitly,
content is associated with some principal responsible
for it (by creating it or by certifying it with a
digital signature). Such a principal can
be another user, a company, a host, or a group of
such entities. In an open distributed environment
like the Internet, it is possible to impersonate
other principals. Impersonation would lead the user to
give access to his resources to a principal who is
not who he claims to be. Strong authentication
is therefore necessary in such an environment. The basic
requirement for authentication is to define who the
principals are and how they can be authenticated.

In our model, principals can be individual
users, companies, or hosts. Each of these principals
should have an associated <public, private> key
pair. Using public-key cryptographic techniques,
we can authenticate the principal. The notion of a
principal can further be extended to groups of such
principals. Assuming the existence of a name space
to resolve the identities of these principals, we can
confirm their identities. Principals can establish
their identity along with the content they have
developed by signing that content with digital signatures.
The next section describes the syntactic specification
of resources and principals in our model.


5 Specification of Principals 


A standard format is required to specify
principals and resources in any of the configuration files.
A resource needs to be specified only when its
corresponding access control list (ACL) is configured.
An ACL is a list of <principal, permissions>
pairs. Hence, principals need to be specified when
ACLs are formed. Principals in an ACL can be
individual users, hosts, or groups of these. Individual
principals can be specified using their associated
names (e.g., Nataraj, syrResearch, SyrUniv, or
diamondsTeam for identities, and ratnam.cat.syr.edu or
cat.syr.edu for host name or domain name
specification). Identity names are unique (we assume the
existence of a global name space) and can be
specified as such. Hence, an identity name can be the name
of an individual identity like Nataraj, an identity of
a team like syrResearch, a company, a body of
companies, and so on. In these cases, even though
an identity might be a collection of other individual
entities, it is by itself considered an identity. This
notion of an entity representing a set of identities
is different from a group of identities. A group is
a set of identities sharing some common property.
Each of those identities is called a member of that
group. A member can represent a group by signing
for the group. In a case like researchTeam, by
contrast, though it is a set of individual identities,
it is a principal by itself with its own key pair. For
such principals, other authorized principals can sign
as the group.

Groups are sets of principals. Principals can be 
identities, hosts or other groups. Groups are speci- 
fied separately in our system. They are specified in 
the format 


<groupName>=<identityName>[,<identityName>] + 


Hence, the following are examples of valid group
specifications:


syrResearch=Nataraj,Doug,Paul
diamondsTeam=Gary,Doug,Nataraj
syrHosts=cat.syr.edu,ece.syr.edu
catHosts=ratnam.cat.syr.edu,lynx.cat.syr.edu
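
As an illustration only, the group format above could be read into a lookup table along the following lines. The class name GroupTable and its methods are hypothetical and are not part of the system described here; the sketch assumes one specification per line and ignores whitespace around names.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper (not the authors' ACLParser): reads lines of the form
// <groupName>=<member>,<member>,... into a group lookup table.
final class GroupTable {
    private final Map<String, Set<String>> groups = new LinkedHashMap<>();

    // Parses one specification line; whitespace around names is ignored.
    void addLine(String line) {
        int eq = line.indexOf('=');
        if (eq < 0) {
            throw new IllegalArgumentException("expected <groupName>=<members>: " + line);
        }
        String name = line.substring(0, eq).trim();
        Set<String> members = new LinkedHashSet<>();
        for (String part : line.substring(eq + 1).split(",")) {
            String member = part.trim();
            if (!member.isEmpty()) {
                members.add(member);
            }
        }
        if (name.isEmpty() || members.isEmpty()) {
            throw new IllegalArgumentException("empty group name or member list: " + line);
        }
        groups.put(name, members);
    }

    // Returns the direct members of a group, or an empty set if the group is unknown.
    Set<String> members(String groupName) {
        Set<String> found = groups.get(groupName);
        return found == null ? Collections.<String>emptySet() : found;
    }

    // True if the principal is listed directly in the named group.
    boolean isMember(String groupName, String principal) {
        return members(groupName).contains(principal);
    }
}
```

The sketch records only direct membership; resolving groups that contain other groups would need an additional traversal.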

The underlying ACLParser object (an instance
of sun.hotjava.security.ACLParser, responsible
for parsing the ACLs specified in the predetermined
format) parses this information and
populates the ACLManager (an instance of
sun.hotjava.security.ACLManager, which maintains
the ACLs and the policy database and is the
decision-making authority for granting access). The
information is stored as a lookup table, as illustrated
in Table 1. Every principal specified in an ACL should
be registered in the user’s identity database. An
identity database is created using the javakey utility
of JDK 1.1. This utility manages a database
of entities (people, companies, etc.), their keys
(public and private), and their certificates, and it
records whether and how each identity is trusted. The
tool also generates signatures for JAR files and
verifies those signatures [14]. A client can use it to
declare whether or not it trusts certain entities.

The principals can also be specified using reg- 
ular expressions. Hence, the following is also a valid 
specification. 


allSyrHosts=*.syr.edu 


Using a combination of regular expressions and other 
groups, new groups can be formed with ease. This 
makes the specification of principal groups easier. 
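
A specification such as *.syr.edu could be matched against host names roughly as sketched below. This is an assumption-laden illustration: the class name HostPattern is hypothetical, and the sketch assumes that "*" stands for one or more arbitrary characters while all other characters are literal, which the text does not spell out.

```java
import java.util.regex.Pattern;

// Hypothetical helper: matches host names against a specification such as
// "*.syr.edu". Assumption: "*" means one or more arbitrary characters and
// every other character is matched literally.
final class HostPattern {
    private final Pattern regex;

    HostPattern(String spec) {
        StringBuilder sb = new StringBuilder();
        // Limit -1 keeps empty leading and trailing pieces around each "*".
        String[] literals = spec.split("\\*", -1);
        for (int i = 0; i < literals.length; i++) {
            if (i > 0) {
                sb.append(".+"); // one "*" in the specification
            }
            sb.append(Pattern.quote(literals[i])); // dots are literal, not wildcards
        }
        regex = Pattern.compile(sb.toString(), Pattern.CASE_INSENSITIVE);
    }

    // True if the whole host name matches the specification.
    boolean matches(String hostName) {
        return regex.matcher(hostName).matches();
    }
}
```

Under these assumptions, new HostPattern("*.syr.edu") accepts ratnam.cat.syr.edu but rejects both syr.edu itself and cat.syr.edu.example.com.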


6 ACL-based Model 


Our model is based on associating access
control lists (ACLs) with resources. We create named
ACLs and associate them with resources. An ACL
is associated with each resource to guard it, and the
ACL itself is independent of the resource it guards. A
<resource, ACL> pair thus means that the principals
listed in the ACL have the corresponding permissions
on the resource. An ACL can be (re)used to guard more
than one resource. The relationship between resources
and ACLs is depicted in Figure 1. In this design, the
key is the resource name. When a principal tries
to access a resource, the system consults the
configuration and obtains the ACL associated with the
resource. It then checks the ACL to see whether the
principal under consideration has the required permission.
If so, the system allows the principal to access the
resource. Otherwise, it denies the attempted access.
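
The lookup just described can be sketched as two maps: one from resource names to ACL names, and one from ACL names to their entries. The class PolicyTable and its method names are hypothetical, resource names are treated as exact keys (no path matching), and denying access to an unguarded resource is an assumption consistent with the default-deny policy of the model.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the <resource, ACL> association; not the authors' code.
final class PolicyTable {
    // ACL name -> (principal -> granted permissions)
    private final Map<String, Map<String, Set<String>>> acls = new HashMap<>();
    // resource name -> name of the ACL guarding it (the resource name is the key)
    private final Map<String, String> guards = new HashMap<>();

    // Adds a <principal, permission> entry to a named ACL.
    void grant(String aclName, String principal, String permission) {
        acls.computeIfAbsent(aclName, k -> new HashMap<>())
            .computeIfAbsent(principal, k -> new HashSet<>())
            .add(permission);
    }

    // Associates a resource with a named ACL; one ACL may guard many resources.
    void guard(String resource, String aclName) {
        guards.put(resource, aclName);
    }

    // Looks up the ACL guarding the resource and checks the principal's permission.
    boolean isAllowed(String principal, String resource, String permission) {
        String aclName = guards.get(resource);
        if (aclName == null) {
            return false; // assumption: unguarded resources are denied by default
        }
        Map<String, Set<String>> acl = acls.get(aclName);
        if (acl == null) {
            return false;
        }
        Set<String> granted = acl.get(principal);
        return granted != null && granted.contains(permission);
    }
}
```

Because the ACL is stored independently of the resources, guarding a second resource with the same ACL requires only one more entry in the resource map.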

In this system, the Java VM traps any access
to the system’s resources. The request is funneled
through the security manager, which is responsible
for checking whether the access is authorized by the
user. The <resource, ACL> associations are formed
during the start-up of the application (which is
executing downloaded content); hence, finding the ACL
at runtime is basically an associative lookup. The
user is also given the flexibility
to add new <resource, ACL> entries at runtime.
The configuration is then dynamically updated, and
so is the database.


7 Trust Policy 


In this section, we describe the semantics
of the decision to grant or deny access based on the
specification of ACLs and policy. We first explain
the logic behind the decisions taken by the security
manager. We then provide the format
specification of ACLs and policies.


7.1 Access Granting Policy 


Each user maintains a local security database 
containing the trust policy information. It consists 
of 


• a database of principals (and keys) created
using the javakey utility


• a specification of groups, formed by a set of
principals


• a set of access control lists, containing
<principal, permissions> pairs


• a list of <resource, ACL> (policy) pairs defining
the trust relationship


The specification of the ACLs and associating re- 
sources with ACLs together form the trust policy 
database. When a request for an access to a resource 
is submitted, the application consults (through our 
enhanced security manager) this database to make 
a decision based on the ACL guarding the resource. 

Figure 2 depicts the decision flow for granting
downloaded content permission to access
a resource. The default policy is to deny any
access unless the user explicitly grants it. If the
identity is given or denied access through
explicit specification by the author, then the decision
is based on that specification only. If there is no
explicit individual specification of permission, the
same check is carried out for all of the groups that
the identity is a member of. If even one of the
groups is explicitly denied access, the principal
is denied access by the security manager. If all
the groups are explicitly given access, the principal
is granted the request to access the resource. If
no explicit permission is specified, either for the
individual identity or for any group of which it is a
member, then access is denied by default. The browser
then dynamically asks the user whether he would like
to allow the principal who has signed the applet to
access the resource. Depending on the user’s input, the
database as well as the runtime are dynamically
updated.

GroupName      Principals                             Principal Type
syrResearch    Nataraj, Doug, Paul                    Identities
diamondsTeam   Gary, Doug, Nataraj                    Identities
syrHosts       cat.syr.edu, ece.syr.edu               Hosts
catHosts       ratnam.cat.syr.edu, lynx.cat.syr.edu   Hosts

Table 1: A Group Table

[Figure 1 diagram: RESOURCES on one side and ACLs on the
other, linked by a "guards" relation; legible labels
include fileDirAcl, display, and DiskAcl]

Figure 1: Sample Relation between Resources and ACLs

It is also possible for the user to specify negative
permissions. This flexibility is useful when
specifying exceptions to group access permissions.
For example, the user might want to give access to all
the members of a group except one. In such
a case, instead of forming a new group (a subgroup
of the original group), it is easier to grant permissions
to the group and specify the member as an
exception. All combinations of principals can thus be
accommodated in the permission specification using this
flexible format. The specification formats for ACLs
and policy are given below.
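
The decision flow of Figure 2, as described in the text, can be sketched as follows. The names Decision and AccessDecider are hypothetical. One case is not settled by the text, namely an identity whose groups include some with an explicit grant and some with no entry at all; the sketch assumes a grant in that case as long as no group denies, and this assumption is marked in the code.

```java
import java.util.List;

// Outcome of a check; ASK_USER stands for "denied by default, then the user is queried".
enum Decision { GRANT, DENY, ASK_USER }

// Hypothetical sketch of the decision flow; not the authors' implementation.
final class AccessDecider {
    // Each entry is TRUE (explicit grant), FALSE (explicit deny),
    // or null (no explicit specification).
    static Decision decide(Boolean individualEntry, List<Boolean> groupEntries) {
        // An explicit individual specification decides on its own,
        // which is what lets a member be an exception to a group grant.
        if (individualEntry != null) {
            return individualEntry ? Decision.GRANT : Decision.DENY;
        }
        boolean anyGrant = false;
        for (Boolean entry : groupEntries) {
            if (entry == null) {
                continue; // this group says nothing about the permission
            }
            if (!entry) {
                return Decision.DENY; // one denying group is enough to deny
            }
            anyGrant = true;
        }
        // Assumption: groups without an entry do not block a grant from another group.
        return anyGrant ? Decision.GRANT : Decision.ASK_USER;
    }
}
```

With this ordering, a negative entry for a single member overrides a grant made to the member's group, matching the exception mechanism described above.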


7.2 ACL and Policy: Specification Format 


An ACL relates principals to permissions. We
use the ACL framework in JDK 1.1 [14]. The
principal is the key field in the ACL database. Given a
principal name, one can obtain all the permissions
associated with the principal. The format is as
follows:


[+|-]{User|Group}.{Identity|Host}.<PrincipalName>=<setOfPermissions>


where the first field (optional) specifies whether the
entry is a granted permission or an exception. A
- in that field indicates an exception, i.e.,
the principal of the given PrincipalName is explicitly
denied the specified setOfPermissions. A + in the first
field (or nothing at all in that optional
field) indicates that the principal is given the
specified set of permissions. The keyword User, in the
second field, specifies that the principal name
in this entry is an individual principal, whereas
the keyword Group specifies that the principal name
is actually the name of a group of principals. This
resolves the PrincipalName specified in the fourth
field to be either an individual principal or a group.
The third field indicates whether the specified principal
is an Identity (name of a person, company, team,
etc.) or a Host (host name, domain name, etc.). The
setOfPermissions is a list of permissions separated
by commas.

Figure 2: Decision flow for access control

The following example is a valid ACL specification.


+User.Identity.SyrUniv=FileRead, FileWrite
+Group.Host.syrHosts=FileRead, FileWrite
-User.Host.ratnam.cat.syr.edu=FileWrite


In the above ACL (say it is named acl1), the
principal SyrUniv has the FileRead and FileWrite
permissions granted to it. All the host principals
in syrHosts have the FileRead and FileWrite
permissions, except for the individual host
ratnam.cat.syr.edu (presumably a member of the syrHosts
group), which has been denied the FileWrite permission.
So effectively, the host ratnam.cat.syr.edu has the
FileRead permission through its membership in the
syrHosts group but does not have the FileWrite
permission, even though all the other group members
have it. For instance, let the policy specification contain


/hostA/users/nataraj/javaWork/*=acl1 


In this case, the ACL named acl1 guards the
directory /hostA/users/nataraj/javaWork. So content
certified by SyrUniv can read and write to that
directory. Also, all the syrHosts can read and write
to that directory, with the exception of the host
ratnam.cat.syr.edu, which cannot write to it.
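
To make the ACL entry format concrete, the following sketch parses a single entry into its fields. The class AclEntrySpec is hypothetical and is unrelated to the actual ACLParser; the sketch assumes that no whitespace appears between the sign, the keywords, and the dots, and that a missing sign means a grant, as stated in the format description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for one ACL entry of the form
// [+|-]{User|Group}.{Identity|Host}.<PrincipalName>=<setOfPermissions>
final class AclEntrySpec {
    // The principal name may itself contain dots (e.g. a host name),
    // so it is taken to be everything up to the "=" sign.
    private static final Pattern FORMAT =
        Pattern.compile("([+-])?(User|Group)\\.(Identity|Host)\\.([^=]+)=(.+)");

    final boolean granted;       // false for an exception (leading "-")
    final boolean group;         // true if the name refers to a group
    final boolean host;          // true for Host, false for Identity
    final String principalName;
    final List<String> permissions;

    private AclEntrySpec(boolean granted, boolean group, boolean host,
                         String principalName, List<String> permissions) {
        this.granted = granted;
        this.group = group;
        this.host = host;
        this.principalName = principalName;
        this.permissions = permissions;
    }

    static AclEntrySpec parse(String line) {
        Matcher m = FORMAT.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("not a valid ACL entry: " + line);
        }
        List<String> perms = new ArrayList<>();
        for (String part : m.group(5).split(",")) {
            String perm = part.trim();
            if (!perm.isEmpty()) {
                perms.add(perm);
            }
        }
        if (perms.isEmpty()) {
            throw new IllegalArgumentException("no permissions listed: " + line);
        }
        // A missing sign (group 1 is null) counts as a grant.
        return new AclEntrySpec(!"-".equals(m.group(1)), "Group".equals(m.group(2)),
                                "Host".equals(m.group(3)), m.group(4).trim(), perms);
    }
}
```

For example, parsing -User.Host.ratnam.cat.syr.edu=FileWrite yields an exception entry for the individual host ratnam.cat.syr.edu with the single permission FileWrite.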


8 Implementation Status 


We have implemented our model for the Hot- 
Java browser. The browser allows selective access 
control of resources by the user. The user interface 
has been designed so that an end-user need not deal 
with the ACL or policy specification format details. 
The user can use the interface to specify the group 
and access configuration. The internal format and 
storage details are taken care of by our system. 

We have built a security manager class,
BrowserSecurityManager, which subclasses
sun.applet.AppletSecurity. An instance
of BrowserSecurityManager is created when the
browser is initialized. This security manager object
has an ACLManager object, which acts as an interface
to the ACLs and the policy database. Whenever
an access is attempted, the sandbox security model
funnels the request to our security manager object.
This object consults the ACLManager object to see
whether the remote principal (responsible for the applet
that originates the access request) has the necessary
permission(s) to perform the operation. When the
security manager receives a grant message from the
ACLManager object, the access is permitted. Otherwise,
a security exception is thrown.

Principals           Principal Type   Permissions           Permission Type
SyrUniv              Identity         FileRead, FileWrite   Grant
syrHosts             Hosts            FileRead, FileWrite   Grant
ratnam.cat.syr.edu   Host             FileWrite             Deny

Table 2: An ACL declaration

The group and access configuration of the 
client system are read during initialization of the 
ACLManager object. With the assistance of 
ACLParser, the group specification and access speci- 
fication are read and the ACLManager object is pop- 
ulated with this information. 

We have provided
implementations of the java.security.acl.Permission,
java.security.acl.AclEntry and java.security.acl.Acl
interfaces to serve our purpose. The BrowserAclImpl
class is responsible both for adding ACL entries
(after reading the policy database) and for checking
permissions at runtime. The class relationship
is depicted in Figure 3 (for clarity, we have omitted
method names from the class diagram).

The interface to the policy database is pro- 
vided through the ACLManager object. In turn, the 
ACLParser object has the permission to read the 
policy database. The BrowserAclImpl object can 
add new entries at runtime (if the user wishes to 
add a trusted principal) and can update the policy 
database. Thus all these objects cooperate to
control access to the client’s resources. Currently, we
have implemented our system to provide access control
for the HotJava browser based on who has
certified (signed) the applet and/or the source host of
the applet (the host from which the applet is
downloaded). In the future, we plan to extend the
communication through the Secure Socket Layer (SSL) and
to provide a flexible delegation mechanism, once the
necessary delegation framework is in place.


9 Conclusion 


We have presented a model to control the ac- 
cess to resources in an open distributed environment 
like the Internet. This model has been designed to 
provide advanced security features to user agents like 
browsers enabling them to selectively trust and grant 
access permissions to principals in such an open
environment. This flexibility is critical not only for
the development of applications for the open, untrusted
Internet but also for any trusted intranet. The need for
such a flexible security framework still exists.
Public-key cryptography techniques have been put
to effective use in establishing authenticity over the
network. Using our model, we have implemented a
solution that lets browsers download content over an
untrusted network and execute it in a trusted
environment without damaging the client machine.

We have modeled our system with the flexibility
to deal with any kind of resource. This will be
especially useful given the current state of
object-oriented technology, in which distributed objects
cooperate to achieve their goals. Systems based on
intelligent agents or distributed objects providing
different services are being built. These objects may
communicate either through remote method invocations,
as provided by the RMI package [8], or through the
alternative remote object mechanism that we have
proposed [9]. Given
the Web’s heterogeneous nature and Java’s suitable
positioning as an object-oriented platform for the
Internet, our model can readily be extended to
distributed objects.

In a distributed environment, the rights of a
principal may be delegated to other principals. Thus the
access control model needs to be extended to
accommodate such delegated rights. As more distributed
object-based systems evolve, and given the speed with
which Web-based applications are being deployed,
such a framework is needed to provide a secure
environment for remote accesses. In
the future, we plan to develop a practical delegation
model for secure distributed computation over the
Internet.
Abstract


This paper describes the Service Configurator pattern, which 
decouples the implementation of services from the time when 
they are configured. This pattern increases the flexibility and 
extensibility of applications by enabling their constituent ser- 
vices to be configured at any point in time. The Service Con- 
figurator pattern is widely used in application environments 
(e.g., to configure Java applets into WWW browsers), operat- 
ing systems (e.g., to configure device drivers), and distributed 
systems (e.g., to configure standard Internet communication 
services). 


1 Introduction 


A rapidly growing collection of services is now available on 
the Internet. The term service has several generally accepted 
meanings: (1) a single capability offered by a server (such 
as the echo service provided by the inetd superserver), 
(2) a collection of capabilities offered by a server (such as 
the inetd superserver itself), and (3) a collection of servers 
that cooperate to achieve a common task (such as a collection 
of rwho daemons in a LAN that periodically broadcast and 
receive status reports on user and host activities). Unless 
otherwise indicated, this paper uses the first definition of 
service, i.e., an identifiable component in a server that offers 
a single capability to communicating entities. 

The range of services available on the Internet includes
WWW browsing and content retrieval services, software
distribution services, electronic mail and network news
transfer agents, file access on remote machines, remote terminal
access, routing table management, host and user activity
reporting, network time protocols, and object request brokerage
services.

A common way to implement these services is to develop
each one as a separate program and then compile, link, and
execute each program in a separate process. However, this
“static” approach to configuring services yields inflexible,
and often inefficient, applications and software architectures.
The main problem with this static approach is that it tightly
couples the implementation of a particular service with the
configuration of the service with respect to other services in
an application or system.

This paper describes the Service Configurator pattern, 
which increases application flexibility, and often perfor- 
mance, by decoupling the behavior of services from the point 
in time at which these service implementations are config- 
ured into an application or system. The examples in this 
paper illustrate the Service Configurator pattern using Java 
applets. However, the Service Configurator pattern has been 
implemented in many ways, ranging from device drivers in 
modern operating systems (like Solaris and Windows NT) 
to Internet superservers (like inetd and the Windows NT 
Service Control Manager). 

This paper is organized as follows: Section 2 describes 
the Service Configurator pattern using a variant of the GoF 
pattern format [1] and Section 3 presents concluding remarks. 


2 The Service Configurator Pattern 


2.1 Intent 


Decouples the behavior of services from the point in time at 
which service implementations are configured into an appli- 
cation or system. 


2.2 Also Known As 


Super-server 


2.3 Motivation 


The Service Configurator pattern decouples the implemen- 
tation of services from the time at which the services are 
configured into an application or a system. This decoupling 
improves modularity of the services and allows the services 
to evolve over time independently of configuration issues, 
such as whether or not two services must be co-located or 
what concurrency model will be used to execute the services. 

In addition, the Service Configurator pattern provides cen- 
tralized administration of all the services it configures. This 
facilitates automatic initialization and termination of the ser- 
vices and can optimize performance by performing common 
service initialization and termination activities. 

Figure 1: A Distributed Time Service


This section motivates the Service Configurator pattern 
using a distributed time service as an example. 


2.3.1 Context 


The Service Configurator pattern should be applied when 
a service needs to be initiated, suspended, resumed, and ter- 
minated dynamically. In addition, the Service Configurator 
pattern should be applied when service configuration deci- 
sions must be deferred until run-time. 

To illustrate this pattern, consider the distributed time ser- 
vice shown in Figure 1. This service provides accurate, fault- 
tolerant clock synchronization for computers collaborating in 
local area networks and wide area networks. A synchronized 
time service is important in distributed systems that require 
multiple hosts to maintain accurate global time. For instance, 
large-scale distributed medical imaging systems [2] require 
globally synchronized clocks to ensure that patient exams 
are accurately timestamped and analyzed expeditiously by 
radiologists throughout the health-care delivery system. 

As shown in Figure 1, the architecture of the distributed 
time service contains the following components: 


• Time Server, which answers queries about the time
made by Clerks.


• Clerk, which queries one or more Time Servers to
determine the correct time, calculates the approximate
correct time using one of several distributed time
algorithms [3, 4], and updates its own local system time.


• Client, which uses the global time information
maintained by a Clerk to provide consistency with the notion
of time used by clients on other hosts.


2.3.2 Common Traps and Pitfalls 


One way to implement the distributed time service is to stat- 
ically configure the logical functionality of Time Servers, 
Clerks, and Clients into separate physical stand-alone pro- 
cesses. In this static approach, one or more hosts would 
run Time Server processes, which handle time update re- 
quests from Clerk processes. Each host that requires global 
time synchronization would run a Clerk process. The Clerks 
periodically update their local system time based on values 
received from one or more Time Servers. Client processes 
would then use the synchronized time reported by their local 
Clerk.²

In addition to the time service, other services (such as file 
transfer, remote login, and HTTP servers) provided by the 
hosts would also execute in separate statically configured 
processes. 

However, implementing and configuring services in the 
static manner shown above has the following drawbacks: 


• Service configuration decisions must be made too early
in the development cycle: This early binding is undesir- 
able since developers may not know a priori the best way 
to co-locate or distribute service components. For example, 
minimal memory resources in wireless computing environ- 
ments may force the split of Client and Clerk into two in- 
dependent processes running on separate hosts. In contrast, 
in a real-time avionics environment it might be necessary to 
co-locate the Clerk and the Time Server into one process to 
reduce communication latency. Forcing developers to com- 
mit prematurely to a particular service configuration impedes 
flexibility and can reduce performance and functionality. 


• Modifying or terminating a service may adversely affect
other services: In the static approach, the implementation 
of each service component is tightly coupled with its ini- 
tial configuration. This makes it hard to modify one service 
without affecting other services. For example, in the real- 
time avionics environment mentioned above, a Clerk and a 
Time Server might be statically configured to execute in one 
process to reduce latency. If the distributed time algorithm 
implemented by the Clerk is changed, the existing Clerk 
code would require modification, recompilation, and relink- 
ing. However, terminating the process to change the Clerk 
code would also terminate the Time Server. This disruption 
in service availability may not be acceptable for mission-critical
distributed systems (such as telecommunication switches
or call centers [5]).


² For platforms that support shared memory, communication overhead
can be minimized by storing the current time in shared memory that is
mapped into the address space of the Clerk and all Clients on the same host.

• System performance may not scale up efficiently:
Associating a process with each service ties up valuable OS
resources (such as I/O descriptors, virtual memory, and process
table slots). This can be wasteful if services are frequently 
idle. Moreover, processes are often the wrong concurrency 
model for many short-lived communication tasks (such as 
asking a Time Server for the current time or resolving a host 
address request in the Domain Name Service). In these cases, 
a multi-threaded Active Object [6] or a single-threaded Re- 
active [7] event loop may be more efficient. 


2.3.3 Solution 


Often, a more convenient and flexible way to implement 
distributed services is to use the Service Configurator pattern. 
This pattern decouples the behavior of services from the point
in time at which the service implementations are configured 
into an application or system. The Service Configurator 
pattern resolves the following forces: 


• The need to defer the selection of a particular type,
or a particular implementation, of a service until very 
late in the design cycle: Dynamic configuration allows 
developers to concentrate on the functionality of a service, 
without committing themselves prematurely to a particular 
configuration of services. By decoupling functionality from 
configuration, the Service Configurator pattern permits appli- 
cations to evolve independently of the configuration policies 
and mechanisms used by the system. 


• The need to build complete applications or systems
by composing multiple independently developed services 
that do not require global knowledge: The Service Con- 
figurator pattern requires all services to have a uniform inter- 
face for configuration and control. This allows the services to 
be treated as modular building blocks that can be integrated 
easily as components in a larger application. Enforcing a 
uniform interface for all services makes them “look and feel” 
the same with respect to how they are configured, thereby 
simplifying application development. 


• The need to optimize and control the behavior of a
service at run-time: Decoupling implementation details 
of a service from configuration decisions makes it possible 
to fine-tune certain implementation or configuration param- 
eters of services. For instance, depending on the parallelism 
available on the hardware and operating system, it may be 
either more or less efficient to run one or more services in 
separate threads or processes. The Service Configurator pat- 
tern enables applications to select and tune these behaviors at 
run-time, when additional information (such as the number 
of CPUs or the OS version) is available to help optimize the 
services. In addition, adding a new or updated service to 
a distributed system may not require downtime for existing 
services.


Figure 2 uses OMT notation to illustrate the structure of 
the distributed time service designed according to the Service 
Configurator pattern. 


[Figure 2 diagram (OMT): a Service Repository holding
services; the Service interface lists suspend(), resume(),
and info()]

Figure 2: Structure of a Distributed Time Service





The Service base class provides a standard interface for 
configuring and controlling services (such as Time Servers or 
Clerks). A Service Configurator-based application uses this 
interface to initiate, suspend, resume, and terminate a service, 
as well as to obtain service-specific information (such as the 
service name, host address, and port number). Services
reside within a Service Repository and can be added to
and removed from the Service Repository by Service 
Configurator-based applications. 
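
A minimal Java sketch of this uniform interface and of the repository follows. The method names init, fini, suspend, resume, and info are taken from the figures; the parameter and return types, and the repository operations insert, remove, and find, are assumptions made only for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Uniform interface for configuring and controlling a service.
// Signatures are assumed; only the method names come from the figures.
interface Service {
    void init(String[] args); // initiate the service
    void fini();              // terminate the service
    void suspend();
    void resume();
    String info();            // service-specific information, e.g. name, host, port
}

// Hypothetical repository that holds the currently configured services.
final class ServiceRepository {
    private final Map<String, Service> services = new LinkedHashMap<>();

    // Initializes the service and records it under the given name.
    void insert(String name, Service service, String[] args) {
        service.init(args);
        services.put(name, service);
    }

    // Terminates and removes the named service; returns false if it was not present.
    boolean remove(String name) {
        Service service = services.remove(name);
        if (service == null) {
            return false;
        }
        service.fini();
        return true;
    }

    // Returns the named service, or null if it is not configured.
    Service find(String name) {
        return services.get(name);
    }
}
```

Because callers see only the Service interface, a TimeServer or a Clerk can be added to or removed from the repository without changing the code that manages it.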

Two subclasses of the Service base class appear in 
the distributed time service: TimeServer and Clerk. 
Each subclass represents a concrete Service, which has 
specific functionality in the distributed time service. The 
TimeServer service is responsible for receiving and pro- 
cessing requests for time updates from Clerks. The Clerk 
service is a Connector [8] factory that performs the following 
tasks: 


1. Creates a new connection for every server; 


2. Dynamically allocates a new handler to send time update
requests to a connected server; 


3. Receives the replies from all the servers through the 
handlers; 


4. Updates the local system time based on an average of 
all TimeServer responses. 
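
Step 4 can be illustrated with a small averaging routine. This is only a sketch: the class name ClerkAverage is hypothetical, times are assumed to be milliseconds represented as long values, and a plain arithmetic mean stands in for whichever distributed time algorithm a real Clerk would use.

```java
// Hypothetical helper for step 4: averages the times reported by the servers.
final class ClerkAverage {
    // Returns the mean of the reported times, truncated toward zero.
    static long averageMillis(long[] serverTimes) {
        if (serverTimes.length == 0) {
            throw new IllegalArgumentException("no TimeServer replies received");
        }
        // Average the offsets from the first reply rather than the raw values,
        // so that summing large timestamps cannot overflow.
        long base = serverTimes[0];
        long offsetSum = 0;
        for (long t : serverTimes) {
            offsetSum += t - base;
        }
        return base + offsetSum / serverTimes.length;
    }
}
```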


The Service Configurator pattern improves the flexibility 
of the distributed time service by managing the configura- 
tion of service components in the time service. Thus, con- 
figuration decisions (such as whether or not to co-locate the 
TimeServer and Clerks) are decoupled from implemen- 
tation details (such as the algorithm used by a Clerk to update 
its notion of time). In addition, implementations of the Ser- 
vice Configurator pattern can provide a framework that con- 
solidates the configuration and management of application 
services in one administrative unit. 


2.4 Applicability 


Use the Service Configurator pattern when: 

• Services must be initiated, suspended, resumed, and
terminated dynamically; and


• An application or system can be simplified by being
composed of multiple independently developed and
dynamically configurable services; or


• The management of multiple services can be simplified
or optimized by configuring them using a single
administrative unit.


Do not use the Service Configurator pattern when:


• Dynamic configuration is undesirable due to security
restrictions (in this case, static configuration of trusted
services may be necessary); or


• The initialization or termination of a service is too
complicated or too tightly coupled to its context to be
performed in a uniform manner; or


• Stringent performance requirements mandate the need
to minimize the extra levels of indirection used by the
OS and language mechanisms for dynamic configuration.


2.5 Structure and Participants 


The structure of the Service Configurator pattern is illustrated 
using OMT notation in Figure 3. 


[Figure 3 diagram (OMT): a Service Repository holds Services; the abstract Service class declares init(), fini(), suspend(), resume(), and info(); Concrete Service A, Concrete Service B, and Concrete Service C derive from Service.]


Figure 3: Structure of the Service Configurator Pattern 


The key participants in the Service Configurator pattern in- 
clude the following: 


• Service (Service)


  – Specifies the interface that contains the abstract
    hook methods [9] (such as methods for initialization
    and termination) used by a Service
    Configurator-based application to dynamically
    configure each Service.


• Concrete Service (Clerk and TimeServer)


  – Implements the service hook methods and other
    service-specific functionality (such as event
    processing and communication with clients and other
    services).


• Service Repository (ServiceRepository)


  – Maintains a repository of all services offered by
    a Service Configurator-based application. This
    allows an administrative entity to centrally manage
    and control the behavior of application services.


2.6 Collaborations 


[Figure 4 diagram: interaction among the Service Configurator, Service A, Service B, and the Service Repository, including a "run event loop" step.]


Figure 4: Interaction Diagram for the Service Configurator
Pattern


Figure 4 depicts the collaborations between components of
the Service Configurator pattern in the following three
phases:


• Service configuration: The Service Configurator
initializes a Service by calling its init method. Once the
Service has been initialized successfully, the Service
Configurator adds it to the ServiceRepository.
The ServiceRepository is used by the Service
Configurator to manage and control all Services that
are installed.


• Service processing: A Service is executed once it
has been configured into the system. Once a Service
is executing, the Service Configurator can suspend and
resume the Service.


• Service termination: The Service Configurator
terminates a Service once it is no longer needed. The
Service Configurator calls the fini method on the
Service to allow it to clean up before terminating.
Once a Service is terminated, it is removed from the
ServiceRepository.³
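
The three phases can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the ACE or Java applet implementation: the `Service` interface mirrors the hook names used in this paper, while `ServiceConfigurator`, `RecordingService`, and every method signature shown are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical Java rendering of the Service hooks; 0 means success.
interface Service {
    int init(String[] args);
    int fini();
    int suspend();
    int resume();
}

// Hypothetical configurator that owns a simple in-memory repository.
final class ServiceConfigurator {
    private final Map<String, Service> repository = new LinkedHashMap<>();

    // Configuration phase: a service enters the repository only after
    // its init hook reports success.
    boolean configure(String name, Service service, String[] args) {
        if (service.init(args) != 0) {
            return false;
        }
        repository.put(name, service);
        return true;
    }

    // Processing phase: suspend or resume an installed service.
    boolean suspend(String name) {
        Service s = repository.get(name);
        return s != null && s.suspend() == 0;
    }

    boolean resume(String name) {
        Service s = repository.get(name);
        return s != null && s.resume() == 0;
    }

    // Termination phase: let the service clean up, then remove it.
    boolean terminate(String name) {
        Service s = repository.get(name);
        if (s == null) {
            return false;
        }
        s.fini();
        repository.remove(name);
        return true;
    }

    int installedCount() {
        return repository.size();
    }
}

// Hypothetical service that records which hooks were called, in order.
final class RecordingService implements Service {
    final StringBuilder calls = new StringBuilder();
    public int init(String[] args) { calls.append("init "); return 0; }
    public int fini() { calls.append("fini "); return 0; }
    public int suspend() { calls.append("suspend "); return 0; }
    public int resume() { calls.append("resume "); return 0; }
}
```

Keeping the repository private to the configurator is one possible design choice; a shared repository would let other administrative tools inspect the installed services as well.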


³ Not all systems support service termination. For example, the Java
run-time environment, which implements the Service Configurator pattern,
provides no way to terminate an applet or unload it once it has been loaded
into the run-time environment (e.g., a WWW browser).


2.7 Consequences


2.7.1 Benefits


The Service Configurator pattern offers the following bene- 
fits: 


• Centralized administration of services: The pattern
consolidates one or more services into a single administrative 
unit. This simplifies development by automatically perform- 
ing common service initialization and termination activities 
(such as opening and closing files, acquiring and releasing 
locks, etc.). In addition, it centralizes the administration of 
services by enforcing a uniform set of configuration manage- 
ment operations on them (such as initialize, suspend, resume, 
and terminate). 


• Increased modularity and reuse: The pattern improves
the modularity and reusability of services by decoupling the 
implementation of these services from the configuration of 
the services. In addition, all services have a uniform interface 
by which they are configured, thereby encouraging reuse and 
simplifying development of subsequent services. 


• Increased opportunity for tuning and optimization:
The pattern increases the range of service configuration alter- 
natives available to developers by decoupling service func- 
tionality from the concurrency models (e.g., threads or pro- 
cesses) used to invoke the service. Developers can adaptively 
tune daemon concurrency levels to match client demands and 
available OS processing resources by choosing from a range 
of concurrency models. Some alternatives include spawn- 
ing a thread or process upon receipt of a client request or 
pre-spawning a thread or process at service creation time. 


2.7.2 Drawbacks 


The Service Configurator pattern has the following draw- 
backs: 


• Lack of determinism: The pattern makes it hard to
determine the behavior of a service and/or application until
run-time. This is particularly problematic for real-time
systems since a dynamically configured service may perform
unpredictably when run with certain services. For exam- 
ple, if consumers in a real-time Event Service do not obey 
their periodic processing constraints, other real-time services 
will miss their deadlines and the system will not behave pre- 
dictably. 


• Reduced reliability: An application that uses the
Service Configurator pattern may be less reliable than one that
does not because a particular configuration of services may 
adversely affect the execution of the services. For instance, 
a faulty service may crash, thereby corrupting state informa- 
tion it shares with other co-located services. This is particu- 
larly problematic for open systems [10], such as Java applets 
within WWW browsers that configure and execute multiple 
services within the same process. 


• Increased overhead: The pattern adds extra levels of
indirection to execute a service. For instance, the Service
Configurator first initializes the service and then loads it into
the Service Repository. This may be an undesirable or
unnecessary overhead in time-critical applications. In
addition, the Service Configurator pattern often configures 
services via dynamic linking, which adds extra indirection to 
invoke functions and access global variables [11]. 


• Lack of generality: If services are tightly coupled, it
may not be possible to dynamically configure them in ar- 
bitrary ways using the Service Configurator pattern. For 
example, it may be necessary to configure two services in a 
specific order or it may be necessary to always co-locate two 
services. The Service Configurator pattern only provides the 
mechanism of decoupling service implementation from ser- 
vice configuration — it does not dictate any policy by which 
services are to be configured. Therefore, the Service Config- 
urator is a building block in a “pattern language” of strategies 
for dynamically configuring and reconfiguring services. 


2.8 Implementation 


The Service Configurator pattern has been implemented in
many contexts, ranging from device drivers in operating
systems like Solaris and Windows NT, to Internet superservers
like inetd, to Java applets in WWW browsers. This section
explains the steps and alternatives involved when implementing
the pattern. These steps and alternatives are summarized in
Table 1.


• Define the service control interface: The following is
the core interface that a service should support to enable the
Service Configurator to configure and control the service:


  – Initialization: Provides an entry point into the service
    and performs initialization of the service;


  – Termination: Terminates execution of a service and
    provides a hook to clean up application resources;


  – Suspension: Temporarily suspends the execution of a
    service;


  – Resumption: Resumes execution of a suspended
    service;


  – Information: Obtains information about a service to
    determine its identity and behavioral characteristics.


There are two ways to define the service control interface: 
inheritance-based and message-based, as described below: 


• Inheritance-based service control interface: In this
approach, each service inherits from a common base
class. This approach is used by the ACE Service
Configurator framework [5] and Java applets,
which define abstract base classes that contain pure
virtual “hook” methods. The following shows a
Service interface similar to the one provided in ACE:


class Service
{
public:
  // = Initialization and termination hooks.
  virtual int init (int argc, char *argv[]) = 0;
  virtual int fini (void) = 0;

  // = Scheduling hooks.
  virtual int suspend (void);
  virtual int resume (void);

  // = Informational hook.
  virtual int info (char **, size_t) = 0;
};


Step                                   Common Alternatives
-------------------------------------  ----------------------------------------
Define the service control interface   • Services inherit from an abstract
                                         base class
                                       • Services respond to control messages

Define a Service Repository            • Maintain an in-memory table of service
                                         implementations
                                       • Maintain a persistent database of
                                         service implementations

Select a configuration mechanism       • Specify at command line
                                       • Specify through a configuration file
                                       • Specify through a user interface

Determine service execution mechanism  • Reactive execution
                                       • Multi-threaded Active Objects
                                       • Multi-process Active Objects

Table 1: Steps Involved in Implementing the Service Configurator Pattern


The init method is the entry point hook into a 
Service. It is used by the Service Configurator to 
initialize and execute a Service. The fini method 
is a hook that allows the Service Configurator to ter- 
minate the execution of a Service. The suspend 
and resume methods serve as scheduling hooks and 
are used by the Service Configurator to suspend and 
subsequently resume the execution of a Service. The
info method allows the Service Configurator to ob- 
tain Service-related information (such as its name 
and network address). Together, these methods impose 
a contract between the Service Configurator and the 
Service objects that it manages. 


• Message-based service control interface: Another way
to control services is to program them to respond to a 
specific set of messages. This makes it possible to in- 
tegrate the Service Configurator into non-OO program- 
ming languages (such as C). The Windows NT Service 
Control Manager (SCM) [12] uses this scheme. Each 
Windows NT host has a master SCM process that au- 
tomatically initiates and manages system services by 
passing them control messages such as PAUSE, RESUME, 
and TERMINATE. Each developer of an SCM-managed 
service must write code to process these messages and 
perform the intended actions. 
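
The message-based style can be sketched in Java as follows. This is a minimal illustration and not the Windows NT SCM API: the `ControlMessage` enum simply reuses the PAUSE, RESUME, and TERMINATE names mentioned above, and `MessageControlledService` and its `handle` method are hypothetical.

```java
// Hypothetical control messages, named after those mentioned in the text.
enum ControlMessage { PAUSE, RESUME, TERMINATE }

// Hypothetical service that is controlled by messages rather than by
// overriding hook methods inherited from a base class.
final class MessageControlledService {
    enum State { RUNNING, PAUSED, TERMINATED }

    private State state = State.RUNNING;

    // Processes one control message and returns the resulting state.
    State handle(ControlMessage message) {
        if (state == State.TERMINATED) {
            return state; // messages after termination are ignored
        }
        switch (message) {
            case PAUSE:
                state = State.PAUSED;
                break;
            case RESUME:
                state = State.RUNNING;
                break;
            case TERMINATE:
                state = State.TERMINATED;
                break;
        }
        return state;
    }
}
```

Because the contract is a set of messages rather than a class hierarchy, the same dispatch structure could be written as a switch over integer codes in a non-OO language.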


• Define a Repository: A Service Repository
maintains all the Service implementations in the form of
objects, executable programs, and/or dynamically linked
libraries (DLLs). A Service Configurator uses the Service
Repository to access a service when it is configured into
or removed from the system. In addition, the Repository
maintains the current status of each service (e.g., whether a 
service is active or suspended). This information may reside 
in main memory, a file system, or the kernel. 


• Select a configuration mechanism: A service must be
configured before it can execute. Configuring a service
requires specifying attributes that indicate the location of the
service’s implementation (such as an executable program or
DLL), as well as the parameters required to initialize a
service at run-time. This configuration information can be
specified in various ways (e.g., on the command line, through
a user interface, or through a configuration file). A
centralized configuration mechanism (such as the NT Registry or
inetd.conf file) simplifies the installation and administration
of the services in an application by consolidating service
attributes and initialization parameters in a single location.
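
As an illustration of the configuration-file alternative, the sketch below parses a made-up line format: a service name, the location of its implementation, and optional initialization arguments, separated by whitespace, with `#` starting a comment line. This format is an assumption for the example only; it is not the syntax of inetd.conf, the NT Registry, or the ACE svc.conf file, and `ServiceEntry` and `ConfigParser` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// One parsed entry of the hypothetical configuration format.
final class ServiceEntry {
    final String name;           // service name
    final String location;       // where the implementation lives
    final List<String> initArgs; // parameters passed at initialization

    ServiceEntry(String name, String location, List<String> initArgs) {
        this.name = name;
        this.location = location;
        this.initArgs = initArgs;
    }
}

final class ConfigParser {
    private ConfigParser() {}

    // Parses lines of the form: <name> <location> [init-arg ...]
    // Blank lines and lines starting with '#' are skipped.
    static List<ServiceEntry> parse(List<String> lines) {
        List<ServiceEntry> entries = new ArrayList<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            String[] fields = line.split("\\s+");
            if (fields.length < 2) {
                throw new IllegalArgumentException("malformed entry: " + raw);
            }
            List<String> args =
                new ArrayList<>(Arrays.asList(fields).subList(2, fields.length));
            entries.add(new ServiceEntry(fields[0], fields[1], args));
        }
        return entries;
    }
}
```

Consolidating all entries in one file keeps service attributes and initialization parameters in a single location, which is the benefit described above.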


• Determine the service concurrency model: A service
that has been dynamically configured by a Service Configura- 
tor can be executed using various combinations of Reactive 
[7] and Active Object [6] schemes. These alternatives are 
briefly outlined below: 


• Reactive execution: This approach uses a single thread
of control to execute the Service Configurator and all 
the services it configures. 


• Multi-threaded Active Objects: This approach runs the
dynamically configured services in their own threads of 
control within the Service Configurator process. The 
Service Configurator can either spawn new threads “on- 
demand” or execute the services within an existing pool 
of threads. 


• Multi-process Active Objects: This approach runs the
dynamically configured services in their own processes. 
The Service Configurator can either spawn new pro- 
cesses “on-demand” or execute the services within an 
existing pool of processes. 
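
The single-threaded and multi-threaded alternatives can be contrasted with a short Java sketch. It is a simplification under stated assumptions: each `Runnable` stands in for one unit of service work, the reactive variant omits event demultiplexing, the multi-process alternative is not shown, and the `ExecutionModels` class is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper contrasting three ways to execute configured services.
final class ExecutionModels {
    private ExecutionModels() {}

    // Reactive execution: the calling thread runs every service in turn.
    static void runReactively(List<Runnable> services) {
        for (Runnable service : services) {
            service.run();
        }
    }

    // Multi-threaded, "on-demand": one new thread is spawned per service.
    static void runThreadPerService(List<Runnable> services) {
        List<Thread> threads = new ArrayList<>();
        for (Runnable service : services) {
            Thread t = new Thread(service);
            t.start();
            threads.add(t);
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    // Multi-threaded, pooled: services run on an existing pool of threads.
    static void runInThreadPool(List<Runnable> services, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (Runnable service : services) {
            pool.execute(service);
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Since the services are passed in as plain `Runnable` objects, the choice among these models can be made at configuration time without changing the services themselves.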


2.9 Sample Code 


The following code shows an example of the Service Con- 
figurator pattern in the context of Java applets. An applet is 
a Java class that can be loaded and run by a Java application 
(such as a Web browser, an applet viewer, or an application). 
The example below focuses on the configuration-related as- 
pects of the distributed time service described in Section 2.3. 
In addition, this example illustrates how other patterns (such 
as the Active Object pattern [6] and the Acceptor and Con- 
nector patterns [8]) are commonly used in conjunction with 
the Service Configurator pattern to develop flexible commu- 
nication infrastructure and services. 

In the example, the Concrete Service class in the 
OMT class diagram shown in Figure 3 is represented 
by the TimeServer class and the Clerk class. The 
Java code in this section implements the TimeServer 
and the Clerk classes.⁴ Both classes inherit from
java.applet.Applet. This allows them to be downloaded
(e.g., from an HTTP server) and dynamically configured
(e.g., into a Java interpreter within a WWW browser).

The WWW server’s file system serves as the Ser- 
vice Repository for the Java applets. In addition, the 
java.applet.Applet class provides hook methods that 
allow dynamic (1) configuration of a service (init), (2)
suspension of a service (stop), and (3) resumption of a
service (start). Note that the java.applet.Applet
class does not provide a termination method equivalent of 
fini described in Section 2.8. The Service Configurator 
pattern remains at the heart of the Java applets, however, 
by allowing their implementation to be decoupled from their 
dynamic configuration. 


2.9.1 The TimeServer Class 


The TimeServer uses the Acceptor class to accept con- 
nections from one or more Clerks. The Acceptor class 
uses the Acceptor pattern [8] to create handlers for each 
Clerk connection that wants to receive requests for time up- 
dates [13]. This design decouples the implementation of the 
TimeServer from its configuration. Therefore, developers 
can change the implementation of the TimeServer inde- 
pendently of its configuration. This design provides flex- 
ibility with respect to evolving the implementation of the 
TimeServer class. 

The TimeServer class inherits from the stan- 
dard java.applet.Applet class, which enables a 
TimeServer to be dynamically loaded into a running Java 
application. Once the TimeServer applet has been down- 
loaded and verified, the Java run-time system invokes its 
init hook. This method performs the TimeServer-specific
initialization code.

The TimeServer class implements the Runnable
interface. This allows it to become an active object and run
in its own thread of control. Running TimeServer as an 
active object is useful if the applet’s main thread of control 
must perform other tasks (such as responding to user GUI 
events and methods called by the system). 


import java.applet.Applet;

public class TimeServer extends Applet
  implements Runnable
{
  // Initialize the TimeServer when loaded. This
  // may include synchronizing server clock with
  // an atomic clock. This method corresponds
  // to the init() hook method of the Service
  // Configurator pattern.
  public void init ()
  {
    // Initialize.
  }

  // (Re)start the TimeServer. Note that this method
  // gets called after init() when the applet first
  // starts up in the context of Java run-time
  // system and also when the applet is restarted
  // after being temporarily stopped. The method
  // spawns off a new thread to handle Clerk
  // connections if a thread is not already running.
  // Otherwise it resumes the currently suspended
  // thread. This method corresponds to the resume()
  // hook method of the Service Configurator pattern.
  public void start ()
  {
    if (serverThread_ == null) {
      serverThread_ = new Thread (this);
      serverThread_.start ();
    }
    else
      // Resume the server thread.
      serverThread_.resume ();
  }

  // Temporarily stop/suspend the TimeServer.
  // This method suspends the thread that handles
  // Clerk connections. This method corresponds
  // to the suspend() hook method of the Service
  // Configurator pattern.
  public void stop ()
  {
    if (serverThread_ != null &&
        serverThread_.isAlive ()) {
      // Suspend the server thread.
      serverThread_.suspend ();
    }
  }

  // Return information about the TimeServer
  // by overriding the method defined in the
  // java.applet.Applet class. This method
  // corresponds to the info() hook method of
  // the Service Configurator pattern.
  public String getAppletInfo ()
  {
    // Return a String containing information
    // about this applet. This may include the
    // name of the host, the version number, etc.
    return new String ( ... );
  }

  // This method serves as the entry point for
  // the TimeServer thread. It is called
  // when the thread starts.
  public void run ()
  {
    // Set the connection acceptor_ endpoint into
    // listen mode (using the Acceptor pattern).
    acceptor_.open (port_);

    // Now use the acceptor_ to accept
    // connections from Clerks.
    // ...
  }

  // Acceptor used for Clerk connections.
  protected Acceptor acceptor_;

  // Port the TimeServer listens on.
  private int port_ = SERVER_PORT_NUMBER;

  // The Server Thread.
  private Thread serverThread_ = null;
}


⁴ To save space, most of the detailed Java code and exception handling
code has been omitted.


The Java run-time system can suspend and resume the
TimeServer by calling its stop and start hooks,
respectively. In addition, it can call the getAppletInfo method
to obtain useful information about the service, such as the
version number or the name of the author.
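
The TimeServer listing relies on Thread.suspend and Thread.resume, which were deprecated in later Java releases because suspending a thread that holds a lock can lead to deadlock. The following is a minimal sketch, not the paper's code, of how the same suspend and resume hooks could be implemented cooperatively with a flag and wait/notifyAll. The `PausableWorker` class and its method names are hypothetical.

```java
// Hypothetical worker illustrating cooperative suspension.
final class PausableWorker implements Runnable {
    private final Object lock = new Object();
    private boolean paused = false;
    private boolean done = false;
    private long iterations = 0;

    // Plays the role of the suspend hook: the worker parks before
    // starting its next unit of work.
    void pause() {
        synchronized (lock) {
            paused = true;
        }
    }

    // Plays the role of the resume hook: wakes the parked worker.
    void unpause() {
        synchronized (lock) {
            paused = false;
            lock.notifyAll();
        }
    }

    // Plays the role of the termination hook: the worker leaves its loop.
    void shutdown() {
        synchronized (lock) {
            done = true;
            lock.notifyAll();
        }
    }

    long iterations() {
        synchronized (lock) {
            return iterations;
        }
    }

    @Override
    public void run() {
        while (true) {
            synchronized (lock) {
                // Wait here while suspended, unless shutdown was requested.
                while (paused && !done) {
                    try {
                        lock.wait();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
                if (done) {
                    return;
                }
                iterations++; // stands in for one unit of service work
            }
            Thread.yield();
        }
    }
}
```

In this sketch an applet's stop method would call pause and its start method would call unpause, so the worker only ever stops at a point where it holds no locks of its own.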


2.9.2 The Clerk Class 


The Clerk uses the Connector pattern [8] to establish and 
maintain connections with one or more TimeServers. The 
Connector pattern creates a handler for every connection to a
TimeServer. The handlers receive and process time updates 
from the TimeServers. 


The java.applet.Applet base class is the parent
of the Clerk class. Therefore, like the TimeServer, a
Clerk can be dynamically configured by the Java run-time 
system acting in the role of Service Configurator. The Java 
run-time system can initialize, suspend, resume, and obtain 
information about the Clerk by calling its init, stop, 
start, and getAppletInfo hooks, respectively. 


import java.applet.Applet;

public class Clerk extends Applet
  implements Runnable
{
  // Initialize the Clerk when loaded. This
  // may include initializing the algorithm
  // implementation to be used to compute the
  // Clerk's notion of time. This method
  // corresponds to the init() hook method of
  // the Service Configurator pattern.
  public void init ()
  {
    // Initialize.
  }

  // (Re)start the Clerk. Note that this method
  // gets called after init() when the applet first
  // starts up in the context of Java run-time
  // system and also when the applet is restarted
  // after being temporarily stopped. The method
  // spawns off a new thread to set up connections
  // with the TimeServers if a thread is not already
  // running. Otherwise it resumes the currently
  // suspended thread. This method corresponds to
  // the resume() hook method of the Service
  // Configurator pattern.
  public void start ()
  {
    if (clerkThread_ == null) {
      clerkThread_ = new Thread (this);
      clerkThread_.start ();
    }
    else
      // Resume the Clerk thread.
      clerkThread_.resume ();
  }

  // Temporarily stop/suspend the Clerk. This
  // method suspends the thread that handles
  // connection to TimeServers. This method
  // corresponds to the suspend() hook method
  // of the Service Configurator pattern.
  public void stop ()
  {
    if (clerkThread_ != null &&
        clerkThread_.isAlive ()) {
      // Suspend the Clerk thread.
      clerkThread_.suspend ();
    }
  }

  // Return information about the Clerk by
  // overriding the method defined in the
  // java.applet.Applet class. This method
  // corresponds to the info() hook method of
  // the Service Configurator pattern.
  public String getAppletInfo ()
  {
    // Return a String containing information about
    // this applet. This may include the name of
    // the host, the version number, etc.
    return new String ( ... );
  }

  // This method serves as the entry point for
  // the Clerk thread. It is called when the
  // thread starts.
  public void run ()
  {
    // Use the connector to set up connections
    // to all the TimeServers. Then use the
    // updateTime() method to send periodic requests
    // to the TimeServers for time updates, receive
    // the replies from the TimeServers, and compute
    // the local notion of time.
    // ...
  }

  // Called periodically to compute the local
  // system time.
  protected void updateTime (long t)
  {
    // Implement Clock Synchronization algorithm
    // here to compute local system time.
  }

  // Connect to TimeServers.
  protected Connector connector_;

  // The Clerk Thread.
  private Thread clerkThread_ = null;
}


The Clerk periodically sends a request for time update to 
all its connected TimeServers. Once the Clerk receives 
responses from all its connected TimeServers, it recalcu- 
lates its notion of the local system time. Thus, when Clients 
ask a Clerk for the current time, they receive a globally syn- 
chronized value. 


2.9.3 Lifecycle of a Service 


Figure 5 shows a state diagram of the lifecycle of a Service
(such as the Clerk service). 


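[Figure 5 diagram: CONFIGURE/init() leads to the RUNNING state; suspend() moves a RUNNING service to SUSPENDED; resume() returns a SUSPENDED service to RUNNING.]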





Figure 5: State Diagram of a Service Lifecycle in the Service 
Configurator Pattern 


Initially the service is idle. Depending upon requirements,
the user can choose from various implementations
dynamically, without having to focus on configuration issues.


For instance, different Clerk services may exist corre- 
sponding to different algorithm implementations. Thus, the 
user may either select a Clerk service that implements the 
Berkeley algorithm [3] or a Clerk service that implements 
Cristian’s algorithm [4]. The choice may depend upon the 
characteristics of the TimeServer. If the machine on
which the TimeServer resides has a WWV receiver,⁵ the
TimeServer can act as a passive entity and Cristian’s
algorithm would be best suited. On the other hand, if the machine
on which the TimeServer resides does not have a WWV
receiver, then an implementation of the Berkeley algorithm
would be more appropriate.
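
To make the difference between the two algorithms concrete, the sketch below shows the core arithmetic of each as they are commonly described; it is not taken from the Clerk implementation. Cristian's algorithm adds half the measured round-trip time to the server's reported time. The Berkeley algorithm averages the participants' clocks and tells each one how far to adjust; the outlier rejection usually included in that algorithm is omitted here. The `ClockSync` class and its method names are hypothetical.

```java
// Hypothetical helper showing the core arithmetic of both algorithms.
final class ClockSync {
    private ClockSync() {}

    // Cristian's algorithm: the client notes when it sent the request and
    // when the reply arrived, then assumes the reply took half the round trip.
    static long cristianEstimate(long sentAt, long receivedAt, long serverTime) {
        if (receivedAt < sentAt) {
            throw new IllegalArgumentException("reply precedes request");
        }
        return serverTime + (receivedAt - sentAt) / 2;
    }

    // Berkeley algorithm (simplified): average all clocks and return the
    // adjustment each participant should apply to reach that average.
    static long[] berkeleyAdjustments(long[] clocks) {
        if (clocks.length == 0) {
            throw new IllegalArgumentException("no clocks");
        }
        long base = clocks[0];
        long offsetSum = 0;
        for (long clock : clocks) {
            offsetSum += clock - base;
        }
        long average = base + offsetSum / clocks.length;
        long[] adjustments = new long[clocks.length];
        for (int i = 0; i < clocks.length; i++) {
            adjustments[i] = average - clocks[i];
        }
        return adjustments;
    }
}
```

The first method suits a passive, authoritative TimeServer; the second needs no authoritative source, which matches the case where no WWV receiver is available.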

Once a Clerk service has been selected, it can be easily 
configured by loading it into the Java run-time environment 
(such as a Web browser, an applet viewer, or an application). 
The following HTML fragment shows how the Clerk applet 
can be loaded in an applet viewer or a Web browser: 

<APPLET code="Clerk.class">
<PARAM name=configFile value="svc.conf">
<PARAM name=pollTime value="10">
</APPLET>


The APPLET tag specifies an applet to be run within a Web
browser or an applet viewer. The PARAM tag specifies named
parameters to be passed to the applet. An applet can look up
the value of a parameter specified in a PARAM tag with the
Applet.getParameter method. In the example above,
the Clerk applet is passed the name of a service configuration
file and a pollTime of 10 seconds. This configuration file
(svc.conf) contains the hostnames and port numbers of all
the Time Servers the Clerk will connect to. The pollTime
indicates how frequently the Clerk will poll the Time Servers.

To reduce communication latency, the Clerk service can
be co-located with a Time Server service. The following
HTML fragment shows how the Clerk applet can be loaded
in an applet viewer or a Web browser together with a
co-located Time Server applet:

<APPLET code="Clerk.class">
<PARAM name=configFile value="svc.conf">
<PARAM name=pollTime value="10">
</APPLET>
<APPLET code="Server.class">
<PARAM name=port value="7734">
</APPLET>


In this example, the TimeServer class will listen at port 7734.

Figure 6 shows the Clerk running independently as well as 
running co-located with a Time Server. This configuration 
decision need not affect the implementation of the various 
time services. Note, however, that if the Clerk and the Time 
Server are co-located in the same process, the Clerk may op- 
timize communication by (1) eliminating the need to set up a 
communication channel with the Server and (2) directly ac- 
cessing the Server’s local notion of time via shared memory. 
In general, the decoupling between a service implementation 
and its configuration exemplifies the flexibility offered by the 
Service Configurator pattern. 


⁵ A WWV receiver intercepts the short pulses broadcast by the National
Institute of Standards and Technology (NIST) to provide Coordinated
Universal Time (UTC) to the public.


Figure 6: (a) Clerk co-located with a Time Server; (b) Clerk
running independently


2.10 Known Uses 


The Service Configurator pattern has been used in a wide 
range of operating system and application programming en- 
vironments including Java applets, UNIX, Windows NT, and 
ACE: 


• Java applets: The applet mechanism in Java uses the
Service Configurator pattern. Java supports dynamically 
downloading, initializing, starting, suspending and resum- 
ing applets. For instance, it defines methods (e.g., stop and 
start) that suspend and resume applet threads. A method 
in a Java applet can access the thread it is running in us- 
ing Thread.currentThread(). In addition, threads 
can control each other by invoking methods like stop and 
start. 


• Modern operating system device drivers: Modern
operating systems (such as Solaris [14] and Windows NT
[12]) support dynamically configurable kernel-level
device drivers. For instance, Solaris drivers can be linked
into and unlinked out of the system dynamically via
init/fini/info hooks. This makes it possible to
reconfigure the operating system without having to shut it down,
recompile and relink new drivers into the kernel, and then 
restart the system. 


• UNIX network daemon management: The Service
Configurator pattern has been used in “superservers” that 
manage UNIX network daemons. Two widely available net- 
work daemon management frameworks are inetd [15] and 
listen [16]. Both frameworks consult configuration files 
that specify (1) service names (such as the standard
Internet services ftp, telnet, daytime, and echo), (2) port
numbers to listen on for clients to connect with these services, 
and (3) an executable file to invoke and perform the service 
when a client connects. These frameworks contain a master 
Acceptor [8] process that reactively monitors the set of ports
associated with the services. When a client connection oc- 
curs on a monitored port, the Acceptor process accepts the 
connection and demultiplexes the request to the appropriate 
pre-registered service handler. This handler performs the 
service (either reactively or in an active object) and returns 
any results to the client. 


• The Windows NT Service Control Manager (SCM):
Unlike inetd and listen, the Windows NT Service Con- 
trol Manager (SCM) [12] is not a port monitor. That is, it does 
not provide built-in support for listening to a set of I/O ports 
and dispatching server processes “on-demand” when client 
requests arrive. Instead, it provides an RPC-based interface 
that allows a master SCM process to automatically initiate and 
control (i.e., pause, resume, terminate, etc.) administrator- 
installed services (such as remote registry access).
These services would otherwise run as separate threads within 
a single-service or a multi-service daemon process. Each 
installed service is individually responsible for configuring 
itself and monitoring any communication endpoints (which 
may be more general than I/O ports, e.g., named pipes or 
shared memory). 


• The ADAPTIVE Communication Environment (ACE)
framework: The ACE framework [17] provides a set of 
C++ mechanisms for configuring and controlling services 
dynamically. The ACE Service Configurator ex- 
tends the mechanisms provided by inetd, listen, and 
SCM to automatically support dynamic linking and unlinking 
of services. The mechanisms provided by ACE were influ- 
enced by the interfaces used to configure and control device 
drivers in modern operating systems. Rather than targeting 
kernel-level device drivers, however, the ACE Service 
Configurator framework focuses on dynamic configu- 
ration and control of application-level Service objects. 


2.11 Related Patterns 


The intent of the Service Configurator pattern is similar to 
the Configuration pattern [18]. The Configuration pattern 
decouples structural issues related to configuring services 
in distributed applications from the execution of the ser- 
vices themselves. The Configuration pattern has been used 
in frameworks for configuring distributed systems (such as 
Regis [19] and Polylith [20]) to support the construction of 
a distributed system from a set of components. In a simi- 
lar way, the Service Configurator pattern decouples service 
initialization from service processing. The primary differ- 
ence is that the Configuration pattern focuses more on the 
active composition of a chain of related services, whereas 
the Service Configurator pattern focuses on the dynamic ini- 
tialization of service handlers at a particular endpoint. In 
addition, the Service Configurator pattern also focuses on 
decoupling service behavior from the service’s concurrency 
strategies. 

The Manager Pattern [21] manages a collection of objects 
by assuming responsibility for creating and deleting these 
objects. In addition, it provides an interface to allow clients
access to the objects it manages. The Service Configurator 
pattern can use the Manager pattern to create and delete Ser- 
vices as needed, as well as to maintain a repository of the 
Services it creates using the Manager Pattern. However, the
functionality of dynamically configuring, initializing, sus- 
pending, resuming, and terminating a Service created using 
the Manager Pattern must be added to fully implement the 
Service Configurator Pattern. 

A Service Configurator often makes use of the Reactor 
[7] pattern to perform event demultiplexing and dispatching 
on behalf of configured services. Likewise, dynamically 
configured services that run for long periods of time often
execute using the Active Object pattern [22]. 

Administrative interfaces (such as configuration files or 
GUIs) to a Service Configurator-based system provide a Fa- 
cade [1]. This Facade simplifies the management and control 
of applications that are executing within the Service Config- 
urator. 

The virtual methods provided by the Service base class 
are callback “hooks” [23] or “hook methods” [9]. These 
hooks are used by the Service Configurator to initiate, sus- 
pend, resume, and terminate services. 

A Service (such as the Clerk class) may be created 
using a Factory Method [1]. This allows an application to 
decide the type of Service subclass to create. 
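
The relationship between the hook methods, the Factory Method, and the repository of services can be sketched as follows. This is only an illustrative sketch, not the ACE implementation: the method names (init, suspend, resume, fini, register, configure, remove) and the state strings are assumptions chosen for the example, and the Clerk class is an empty stand-in for the service named in the text.

```python
class Service:
    """Base class: the configurator drives services only through these hooks."""

    def __init__(self):
        self.state = "created"

    def init(self):       # hook: initiate the service (name is an assumption)
        self.state = "running"

    def suspend(self):    # hook: suspend the service
        self.state = "suspended"

    def resume(self):     # hook: resume the service
        self.state = "running"

    def fini(self):       # hook: terminate the service (name is an assumption)
        self.state = "terminated"


class Clerk(Service):
    """Stand-in for the Clerk service; a real one would override the hooks."""


class ServiceConfigurator:
    """Keeps a repository of services and calls their hooks on request."""

    def __init__(self):
        self._factories = {}    # service name -> callable creating a Service
        self._repository = {}   # service name -> configured Service instance

    def register(self, name, factory):
        self._factories[name] = factory

    def configure(self, name):
        # The factory decides which Service subclass is created.
        service = self._factories[name]()
        service.init()
        self._repository[name] = service
        return service

    def suspend(self, name):
        self._repository[name].suspend()

    def resume(self, name):
        self._repository[name].resume()

    def remove(self, name):
        self._repository.pop(name).fini()
```

A caller would register a factory under a name (for instance `register("clerk", Clerk)`) and then drive the service purely through the configurator, never through the concrete class.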


3 Concluding Remarks 


This paper describes the Service Configurator pattern and 
illustrates how it decouples the implementation of services 
from their configuration. This decoupling increases the flex- 
ibility and extensibility of services. In particular, service 
implementations can be developed and evolved over time in- 
dependently of many issues related to service configuration. 

The Service Configurator pattern also centralizes the ad- 
ministration of services it configures. This centralization can 
simplify programming effort by automating common service 
initialization tasks (such as opening and closing files, ac- 
quiring and releasing locks, etc). In addition, centralized 
administration can provide greater control over the lifecycle 
of services. 

The Service Configurator pattern has been applied widely 
in many contexts. This paper used Java applets to demon- 
strate the application of the Service Configurator pattern in 
the Java run-time system. The ability to decouple the de- 
velopment of Java applets from their configuration into the 
Java run-time system exemplifies the flexibility offered by 
the Service Configurator pattern. This decoupling allows 
different applets to be developed in accordance with differ- 
ent service implementations. The decision to configure a 
particular applet into the Java run-time system becomes a 
run-time decision, which yields greater flexibility. 

The Service Configurator pattern is also widely used in 
other contexts such as device drivers in Solaris and Win- 
dows NT, Internet superservers like inetd, the Windows NT 
Service Control Manager, and the ACE framework. In each
case, the Service Configurator pattern decouples the imple- 
mentation of a service from the configuration of the service. 
This decoupling supports both extensibility and flexibility of 
applications.


4 Availability 


The ADAPTIVE Communication Environment (ACE) pro- 
vides an implementation of the Service Configurator 
pattern. ACE is freely available via the WWW at 
www.cs.wustl.edu/~schmidt/ACE.html. 
Abstract


Reliable distributed systems involve many complex
protocols. In this context, protocol composition is
a central concept, because it allows the reuse of
robust protocol implementations. In this paper, we
describe how the Strategy pattern has been recursively
used to support protocol composition in the
BAST framework. We also discuss design alternatives
that have been applied in other existing
frameworks.


1 Introduction 


This paper presents how the Strategy pattern has 
been used to build BAST¹, an extensible object-
oriented framework for programming reliable distributed
systems. Protocol composition plays a central
role in BAST and relies on the notion of protocol
class. In this paper, we focus on the recursive use of 
the Strategy pattern to overcome the limitations of 
inheritance, when trying to flexibly compose proto- 
cols. In a companion paper [6], we have presented 
how generic agreement protocol classes can be cus- 
tomized to solve atomic commitment [10] and total 
order multicast [20], which are central problems
in transactional systems and in group-oriented
systems, respectively. In [7], we also show how BAST
allows distributed applications to be made fault-
tolerant by application programmers who are not
necessarily skilled in reliability issues.


*Partially supported by OFES under contract number
95.0830, as part of the ESPRIT BROADCAST-WG (number
22455).

¹We named BAST after the cat-goddess of Egyptian
mythology: cats are known to survive several “crashes”.


The BAST Framework 


Building reliable distributed systems is a chal- 
lenging task, as one has to deal with many com- 
plex issues, e.g., reliable communications, failure 
detections², distributed consensus, replication management,
transactions management, etc. Each of
these issues corresponds to some distributed proto- 
col and there are many. In such a protocol “jun- 
gle”, programmers have to choose the right proto- 
col for the right need. Besides, when more than 
one protocol is necessary, the problem of their in- 
teractions arises, which further complicates pro- 
grammers’ task. The BAST framework aims at 
structuring reliable distributed systems by allow- 
ing complex distributed protocols to be composed 
in a flexible manner. For example, by adequately 
composing reliable multicast protocols with trans- 
actional protocols, BAST makes it possible to trans- 
parently support transactions on groups of repli- 
cated objects. It relies heavily on the Strategy pat- 
tern, which is recursively used to get around the 
limitations of inheritance as far as protocol composition
goes. Our first prototype is written in Smalltalk [8]
and is fully operational. It is currently being
used for teaching reliable distributed systems and
for prototyping new fault-tolerant distributed protocols.
Adding more and more protocol classes will
help us to further test our approach. BAST has also
been recently ported to the Java [9] programming
environment. Performance is not yet good enough
for practical application development, but we are
currently working on performance evaluations and
code optimization [7].


²A failure detector is a high-level abstraction that hides the
timeouts commonly used in distributed systems [2].


(b) protocol class hierarchy


Figure 1: Protocols and Protocol Classes in BAST


Overview of the Paper 


Section 2 introduces the concept of protocol object 
as defined in BAST, and how it helps to structure distributed
systems and to deal with failures. Section 3
discusses why inheritance alone is limited in sup- 
porting flexible protocol composition and presents 
how we applied the Strategy pattern to break these 
limitations. We also show how the Strategy pattern 
is transparently used in a recursive manner, and 
we present what steps have to be performed in or- 
der to extend BAST through protocol composition. 
Section 4 discusses various design alternatives, and 
compares our approach with other research works 
described in the literature. Finally, Section 5 summarizes
the contribution of this paper, as well as
some future developments in the BAST framework. 


2 Protocol Objects 


The BAST framework was designed to help pro- 
grammers in building reliable distributed systems, 
and is based on protocols as basic structuring com- 
ponents. With BAST, a distributed system is com- 
posed of protocol objects that have the ability to 
remotely designate each other and to participate in 
various protocols. A distributed protocol π is a set
of interactions between protocol objects that aim at
solving distributed problem π. We use aπObject
to name a protocol object capable of participating
in protocol π, and we say that πObject is its
protocol class. Each πObject provides a set of
operations that implement the interface of protocol π, i.e.,
these operations act as entry points to the protocol.
Abstract class Protobject is the root of the pro- 
tocol class hierarchy. 

With such broad definitions, any interaction be- 
tween objects located on distinct network nodes is 
a distributed protocol, even a mere point-to-point 
communication. For example, class RMPObject
implements a reliable point-to-point communication
protocol and provides operations rSend()
and rDeliver() that enable the reliable sending
and receiving, respectively, of any object³; callback
operation rDeliver() is redefinable and is said
to be triggered by the protocol. Note that such a ho- 
mogeneous view of what distributed protocols are 
does not contradict the fact that some protocols are 
more basic than others. Communication protocols, 
for example, are fundamental to almost any other 
distributed protocol. 
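
The idea of a protocol object whose entry point is rSend() and whose redefinable callback is rDeliver() can be sketched as follows. This is a minimal sketch under strong assumptions: the class-level `network` dictionary is a hypothetical in-memory stand-in for the transport layer, delivery is a direct method call, and nothing here provides actual reliability. Only the names RMPObject, rSend and rDeliver come from the text; the Logger subclass is invented for illustration.

```python
class RMPObject:
    """Toy protocol object for point-to-point communication."""

    # Hypothetical in-memory registry standing in for the transport layer.
    network = {}

    def __init__(self, name):
        self.name = name
        RMPObject.network[name] = self

    def rSend(self, payload, dest):
        # Entry point to the protocol: hand the payload to the destination.
        RMPObject.network[dest].rDeliver(payload, self.name)

    def rDeliver(self, payload, sender):
        # Callback triggered by the protocol; subclasses redefine it.
        pass


class Logger(RMPObject):
    """Application-level subclass that redefines the callback."""

    def __init__(self, name):
        super().__init__(name)
        self.received = []

    def rDeliver(self, payload, sender):
        self.received.append((sender, payload))
```

The application never calls rDeliver() itself: it only invokes rSend() and lets the protocol trigger the callback on the receiving side.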


Dealing with Failures. Because failures are part 
of the real world, there is the need for reliable dis- 
tributed protocols, e.g., consensus, atomic commit- 
ment, total order multicast. Reliable distributed 
protocols are challenging to implement because 
they imply complex relationships with other un- 
derlying protocols. For example, both the atomic 
commitment and the total order multicast rely on 
consensus, while the latter is itself based on failure 
detections, on reliable point-to-point communica- 
tions, and on reliable multicasts. In turn, reliable 
multicasts can be built on top of reliable point-to- 
point communications. Figure 1 (a) presents an 
overview of some distributed protocol dependen- 
cies. 


In BAST, protocol classes are organized into a 
single inheritance hierarchy which follows protocol 
dependencies, as pictured in Figure 1 (b). Each 
protocol class implements only one protocol, but 
instances of some πObject class can execute any
protocol inherited from πObject’s superclasses.
Protocol objects are able to run several executions 
of identical and/or distinct protocols concurrently. 


³We mean here any object that is not a protocol object.
Allowing the sending of protocol objects across the network
implies solving the distributed object migration problem.
We have not yet addressed this issue in our framework.


3 Strategy Pattern in BAST 


Composing Protocols 


With protocol objects, managing protocol depen- 
dencies is not only possible during the design and 
implementation phases (between protocol classes), 
but also at runtime (between protocol objects). This
is partly due to the fact that protocol objects can ex- 
ecute more than one protocol at a time. In this 
context, trying to compose protocols comes down 
to answering the question “How are protocol layers 
assembled and how do they cooperate?”.

Figure 2 (a) presents a runtime snapshot of 
aCSSObject, some protocol object of class 
CSSObject that implements an algorithm for 
solving the distributed consensus problem. The 
consensus problem is defined on some set Ω of distributed
objects as follows: all correct objects in
Ω propose an initial value and must reach agreement
on one of the proposed values (the decision) [3].
Class CSSObject defines operations
propose() and decide(), which mark the beginning
and the termination of the protocol,
respectively [2]. Besides consensus, protocol object
aCSSObject is also capable of executing
any protocol inherited by its class, e.g., reliable 
point-to-point communications and reliable multi- 
casts, as well as failure detections. In Figure 2 (a), 
aCSSObject is concurrently managing five dif- 
ferent protocol stacks for the application layer, and 
issuing low-level calls to the transport layer. Focus- 
ing on the consensus stack, protocol composition 
means here to assemble various layers, each being 
necessary to execute the consensus protocol, into 
the protocol stack pictured in Figure 2 (b). The 
assembling occurs at runtime and creates a new 
stack each time the application invokes operation 
propose(). 


(b) interacting protocol layers


Figure 2: Protocol Layers and Protocol Objects


Inadequacy of Inheritance Alone. With BAST,
distributed applications are structured according to
their needs in protocols: they are made of protocol
objects, which act as distributed entities capable of
executing various protocols. With this approach, it
all comes down to choosing the right class for the
right problem. We believe that inheritance is an
appropriate tool to achieve this: by passing appropriate
arguments to protocol operations and by implementing
callback operations, programmers have
the ability to tailor generic protocol classes to their 
needs. However, we claim that inheritance alone is 
not sufficient as far as protocol composition goes, 
because it does not offer enough flexibility. For 
example, inheritance does not allow for the easy 
implementation of a new algorithm for some ex- 
isting protocol, and then to use it in various proto- 
col classes that are scattered in the class hierarchy. 
Furthermore, inheritance is not appropriate when 
it comes to choosing among several protocol algorithms
at runtime. These limitations led us to seek
an alternative solution for flexible protocol compo- 
sition. 


Protocol Algorithms as Strategies 


According to Gamma et al., the intent of the Strat- 
egy pattern is to “define a family of algorithms, 
encapsulate each one, and make them interchange- 
able” [5, page 315]. This is usually achieved by 
objectifying the algorithm [4], i.e., by encapsulating
it into a so-called strategy object; the latter is
then used by a so-called context object. Making
each πObject protocol class independent of the
algorithm supporting protocol π is precisely what
we need to be able to compose reliable distributed 
protocols in a flexible manner. 

In the BAST framework, strategy objects repre- 
sent protocol algorithms and they are instances of 
subclasses of class ProtoAlgo. A ProtoAlgo 
subclass that implements an algorithm for solving 
problem π is referred to as class πAlgo. In the
Strategy pattern terminology, a protocol algorithm,
instance of some πAlgo class, is a strategy, and a
protocol object, instance of some πObject class,
is a context. A strategy and its context are strongly
coupled and the application layer only deals with
instances of πObject classes, i.e., it knows nothing
about strategies.


Strategy/Context Interactions. Figure 3 (a)
sketches the way protocol objects and algorithm
objects interact. On the left side, protocol object
aπObject offers the services it inherits from its
superclasses, as well as the new services that are
specific to protocol π. The actual algorithm implementing
protocol π is not part of aπObject’s
code; instead, the latter uses services provided by
strategy aπAlgo (on the right side of Figure 3 (a)).
Whenever an operation related to protocol π is invoked
on aπObject, the execution of the protocol
is delegated to strategy aπAlgo. In turn,
the services required by the strategy to run protocol
π are based on the inherited services of context
aπObject. Such required services merely identify
entry point operations to underlying protocols
needed to solve problem π.


Figure 3: Strategy Pattern in BAST

Each instance of class πAlgo represents one
execution of protocol π implemented by that class,
and holds a reference to the context object for
which it is running; any call to the services required
by the strategy will be issued to its context
object. There might be more than one instance of
the same ProtoAlgo subclass used simultaneously
by aπObject. At runtime, the latter maintains
a table of all strategies that are currently in
execution for it. Each message is tagged to enable
aπObject to identify in which execution of what
protocol that message is involved, and to dispatch it
to the right strategy. Figure 3 (b) presents the relationship
between classes πObject and πAlgo,
using a class diagram based on the Object Modeling
Technique notation [19]. The correspondence
between πAlgo strategy objects and layered protocol
stacks is pictured in Figure 3 (c): at runtime,
each strategy object represents a layer in one of the 
protocol stacks currently in execution. 
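
The bookkeeping described above (one strategy instance per protocol execution, a back-reference to the context, and a table used to dispatch tagged messages) can be sketched as follows. This is a minimal sketch, not BAST code: the class name PiObject stands for the generic context class, and the method names start, receive and handle, as well as the (protocol name, counter) tag format, are assumptions made for the example.

```python
import itertools


class ProtoAlgo:
    """Strategy: one instance per protocol execution."""

    def __init__(self, context, tag):
        self.context = context   # required services are called on the context
        self.tag = tag           # identifies this execution
        self.log = []

    def handle(self, payload):
        # Placeholder for the algorithm's reaction to an incoming message.
        self.log.append(payload)


class PiObject:
    """Context: keeps a table of the strategies currently in execution."""

    def __init__(self):
        self._counter = itertools.count(1)
        self.strategies = {}     # tag -> strategy in execution

    def start(self, protocol, algo_class=ProtoAlgo):
        # Create a fresh strategy for a new execution of the given protocol.
        tag = (protocol, next(self._counter))
        self.strategies[tag] = algo_class(self, tag)
        return tag

    def receive(self, tag, payload):
        # The tag tells which execution of which protocol the message is for.
        self.strategies[tag].handle(payload)
```

Because the table is keyed by tag, two concurrent executions of the same protocol never see each other's messages, which is the property the tagging scheme is meant to provide.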


Consequences. The context/strategy separation 
enables the limitations of inheritance to be over- 
come, as far as protocol composition goes. One 
could for example optimize the reliable multicast 
algorithm and use it in some protocol classes, while 
leaving it unchanged in others. Protocol algorithms 
could even be dynamically edited and/or chosen, 
according to criteria computed at runtime; this fea- 
ture is analogous to the dynamic interpositioning 
of objects. There is a minor compatibility con- 
straint among different protocol algorithms in order 
to make them interchangeable: new algorithm class
πAlgo′ can replace default πAlgo in protocol
class πObject if and only if πAlgo′ requires a
subset of the services featured by the πObject class.
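
The compatibility constraint amounts to a subset test between the services an algorithm requires and the services its context class features. The sketch below illustrates that test under stated assumptions: the `requires` class attribute, the `can_replace` helper and the GossipRMCAlgo class with its `suspects` service are all hypothetical, and the method bodies are empty stubs.

```python
class RMPObject:
    def rSend(self, payload, dest): ...
    def rDeliver(self, payload, sender): ...


class RMCObject(RMPObject):
    def rmcast(self, m, dest_set): ...
    def rmDeliver(self, m): ...


class RMCAlgo:
    # Hypothetical declaration of the services this strategy calls on its context.
    requires = {"rSend"}


class GossipRMCAlgo:
    # Hypothetical alternative that also needs a failure-detection service.
    requires = {"rSend", "suspects"}


def can_replace(algo_class, context_class):
    """True if every required service is featured by the context class."""
    provided = {
        name
        for name in dir(context_class)
        if not name.startswith("_") and callable(getattr(context_class, name))
    }
    return algo_class.requires <= provided
```

With these stubs, RMCAlgo is usable in RMCObject, whereas GossipRMCAlgo is not, because RMCObject features no `suspects` service.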


This approach also helps protocol programmers 
to clearly specify, for each protocol 7, its depen- 
dencies with other protocols. One drawback of the 
Strategy pattern is the overhead due to local interac- 
tions between strategies and contexts. In distributed 
systems however, this overhead is small compared 
to communication delays, especially when fail- 
ures and/or complex protocols are involved. More 
specifically, the time for a local Smalltalk invocation
is normally under 100 µs, whereas a reliable
multicast communication usually takes more
than 100 ms when three or more protocol objects
are involved⁴ (without even considering failures).
The gain in flexibility clearly outweighs the local
overhead caused by the use of the Strategy pattern.


Figure 4: Recursive Use of the Strategy Pattern


Reliable Multicast: an Example 


We now present how we implemented reliable multicast
communications using the Strategy pattern.
In BAST, class RMCObject provides primitives
rmcast() and rmDeliver() that enable the
sending and receiving, respectively, of a message m
to a set of protocol objects referenced in destSet,
in a way that enforces reliable multicast properties. 
The current implementation of class RMCObject 
relies on strategy class RMCAlgo. 


⁴On a 10 Mbits Ethernet connecting Sun SPARCstations 20.


Overview of the Protocol. The protocol starts
when operation rmcast() is invoked on some
initiator object aRMCObjectᵢ, passing it a message
m and a destination set destSet. In this
operation, context aRMCObjectᵢ first creates a
strategy aRMCAlgoᵢ, and then invokes operation
rmcast() on it, with the arguments it just received.
Strategy aRMCAlgoᵢ builds message m′,
containing both m and destSet. It then issues
a reliable point-to-point communication with each
protocol object referenced in destSet; in order
to do this, strategy aRMCAlgoᵢ relies on inherited
service rSend() of context aRMCObjectᵢ.
When message m′ reaches aRMCObjectⱼ, one of
the target objects, operation rDeliver() is triggered
by the protocol. Operation rDeliver()
detects that m′ is a multicast message and forwards
it to aRMCAlgoⱼ, the strategy in charge of that
particular execution of the reliable multicast protocol.
When aRMCAlgoⱼ receives m′ for the first
time, it re-issues a reliable point-to-point communication
with each protocol object referenced in
destSet (extracted from m′), and then invokes
rmDeliver() on its context aRMCObjectⱼ,
passing it message m (also extracted from m′). This
retransmission scheme is necessary because of the
agreement property of the reliable multicast primitive,
which requires that either all correct objects
in destSet or none receive message m [2].
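
The retransmission scheme can be sketched as a small single-process simulation. It is only a sketch under stated assumptions: the in-memory `network` registry, the dictionary used as the wrapped message, the execution identifier and the `receive` method are hypothetical, rSend() is a plain method call with no real reliability, and failures are not simulated. Only the names RMCObject, RMCAlgo, rmcast, rmDeliver, rSend, rDeliver and destSet come from the text.

```python
import itertools


class RMCObject:
    """Context: protocol object offering rmcast() and the rmDeliver() callback."""

    network = {}                  # hypothetical in-memory transport
    _ids = itertools.count(1)

    def __init__(self, name):
        self.name = name
        self.strategies = {}      # execution id -> RMCAlgo
        self.delivered = []
        RMCObject.network[name] = self

    def rSend(self, wrapped, dest):
        # Stands for the inherited point-to-point service.
        RMCObject.network[dest].rDeliver(wrapped)

    def rDeliver(self, wrapped):
        # Forward the multicast message to the strategy in charge of it.
        algo = self.strategies.get(wrapped["id"])
        if algo is None:
            algo = self.strategies[wrapped["id"]] = RMCAlgo(self)
        algo.receive(wrapped)

    def rmcast(self, m, dest_set):
        exec_id = (self.name, next(RMCObject._ids))
        algo = self.strategies[exec_id] = RMCAlgo(self)
        algo.rmcast(exec_id, m, dest_set)

    def rmDeliver(self, m):
        self.delivered.append(m)


class RMCAlgo:
    """Strategy: one execution of the multicast algorithm."""

    def __init__(self, context):
        self.context = context
        self.seen = False

    def rmcast(self, exec_id, m, dest_set):
        # Wrap the message together with its destination set.
        wrapped = {"id": exec_id, "m": m, "destSet": frozenset(dest_set)}
        for dest in wrapped["destSet"]:
            self.context.rSend(wrapped, dest)

    def receive(self, wrapped):
        if self.seen:             # only the first receipt triggers any work
            return
        self.seen = True
        for dest in wrapped["destSet"]:
            self.context.rSend(wrapped, dest)   # re-issue to every target
        self.context.rmDeliver(wrapped["m"])
```

Each target re-sends the wrapped message once before delivering, so a target that received the message has already passed it on to all the others; duplicates are filtered by the `seen` flag.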


Recursive Use of the Strategy Pattern 


When solving distributed problem π, one can
strictly focus on the interaction between class
πObject and class πAlgo, while forgetting
about how other protocols are implemented. In
particular, all protocols needed to support protocol
π are transparently used through inherited services
of class πObject. Those services might
also be implemented applying the Strategy pattern,
but this is transparently managed by inherited operations
of πObject. In that sense, BAST uses the
Strategy pattern in a powerful recursive manner. 

The recursive use of the Strategy pattern is il- 
lustrated in Figure 4. The latter schematically 
presents a possible implementation of protocol 
class CSSObject presented in Section 3, which 
enables the solution of the distributed consensus 
problem by providing operations propose () and 
decide(). In Figure 4, the gray oval is context 
class CSSObject, while inner white circles are 
various πAlgo strategy classes (π being different
protocols). Arrows show the connections between 
provided services (top) and required services (bot- 
tom) of each strategy class. Operations provided 
by class CSSObject are grouped on the applica- 
tion layer side (top). Each strategy class pictured in 
Figure 4 is managed by the corresponding context 
class in the protocol class hierarchy presented in 
Figure 1 (b). 


Extending the BAST Framework 


Basing the BAST framework on the Strategy pattern 
has the advantage of making it easily extensible. 
To illustrate this, we now present how we built 
DTMObject, a protocol class supporting the Dy- 
namic Terminating Multicast (DTM) protocol [11] 
from existing contexts and strategies. The DTM
protocol can be understood as a common denomi- 
nator of many reliable distributed algorithms [12]. 


Overview of the Protocol. The protocol starts 
by the invocation of operation dtmcast() on 
an initiator object, passing it a message m and 
a set of protocol object references destSet. 
This invocation results in a reliable multicast of 
m to the set of participant objects. When message
m reaches some participant, the protocol
triggers operation dtmReceive(), passing m
as argument. The participant object then computes
a reply and returns it. Eventually, operation
dtmInterpret() is triggered by the protocol
on each non-faulty participant object, taking
replySet, a subset of the participants’ replies,
as argument. The protocol ensures that all correct
participant objects get the same subset of replies,
i.e., a consensus has been reached on that set.


Methodology for Extending BAST. A five-step
methodology guides programmers in extending the
BAST framework using the Strategy pattern. We
illustrate each of these steps below, by presenting
how the methodology was applied to the design
of class DTMObject. Figure 5 summarizes the
methodology.


Figure 5: Extending BAST with Protocol Class DTMObject
(the five steps below, applied to DTM with strategy class
DTMAlgo and superclass CSSObject)


1. Establish what services the new protocol
class DTMObject provides, i.e., what operations
are given to programmers wanting
to use DTMObject; those operations
are dtmcast(), dtmReceive() and
dtmInterpret().


2. Choose an algorithm implementing DTM and
determine what services it requires, by decomposing
it in a way that allows as many
existing protocols as possible to be reused; those
services are: consensus, failure detections, as well as
reliable point-to-point and reliable multicast
communications (see [11] for algorithmic details).


3. Implement the chosen algorithm in some
DTMAlgo class; all calls to the above required
services are issued to an instance variable representing
the context object, i.e., an instance
of class DTMObject.


4. Choose the protocol class that will be derived
to obtain new class DTMObject; the choice
of class CSSObject is directly inferred from
step 2, since the chosen superclass has to provide
at least all the services required by protocol
DTM.


5. Implement class DTMObject by connecting
services provided by class DTMAlgo to new
DTM-specific services of class DTMObject,
and by connecting services required by class
DTMAlgo to corresponding inherited services
of class DTMObject.
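
The five steps can be sketched in code as follows. This is a deliberately simplified sketch and not the DTM algorithm of [11]: the CSSObject stubs collapse the distributed services into local calls (the multicast stub collects replies directly and the consensus stub simply decides the proposed value), failure detection is omitted, and the Voter class is invented for the usage example. The class and operation names DTMObject, DTMAlgo, CSSObject, ProtoAlgo, dtmcast, dtmReceive, dtmInterpret, rmcast, rmDeliver and propose follow the text.

```python
class ProtoAlgo:
    def __init__(self, context):
        self.context = context   # required services are called on the context


class CSSObject:
    """Step 4: existing class offering the services required by DTM (stubs)."""

    def rmcast(self, m, dest_set):
        # Stub: "multicast" by calling each target and collecting its reply.
        return [dest.rmDeliver(m) for dest in dest_set]

    def rmDeliver(self, m):
        return None

    def propose(self, value):
        # Stub: consensus trivially decides the proposed value.
        return value


class DTMAlgo(ProtoAlgo):
    """Step 3: the algorithm, written only against required services (step 2)."""

    def dtmcast(self, m, dest_set):
        replies = self.context.rmcast(m, dest_set)
        reply_set = self.context.propose(frozenset(replies))
        for dest in dest_set:
            dest.dtmInterpret(reply_set)


class DTMObject(CSSObject):
    """Step 5: connect provided and required services."""

    def dtmcast(self, m, dest_set):        # provided service (step 1)
        DTMAlgo(self).dtmcast(m, dest_set)

    def rmDeliver(self, m):                # inherited callback feeds DTM
        return self.dtmReceive(m)

    def dtmReceive(self, m):               # callback: compute a reply
        return None

    def dtmInterpret(self, reply_set):     # callback: agreed set of replies
        pass


class Voter(DTMObject):
    """Hypothetical application class redefining the two callbacks."""

    def __init__(self, name):
        self.name = name
        self.result = None

    def dtmReceive(self, m):
        return f"{self.name}:{m}"

    def dtmInterpret(self, reply_set):
        self.result = reply_set
```

Note that DTMAlgo never mentions CSSObject: it only calls rmcast() and propose() on its context, which is what allows the superclass in step 4 to be chosen after the algorithm has been written.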


4 Design Alternatives 


Our first implementation of BAST was not based 
on the Strategy pattern, i.e., distributed algorithms
were not objects, and protocol objects were not 
capable of participating in more than one proto- 
col execution concurrently. Furthermore, proto- 
col composition was only possible through single 
inheritance⁵.


⁵Remember that we used Smalltalk as the implementation
language for prototyping.

Because protocol objects are the basic addressable
distributed entities in our approach, it is not
possible to guarantee that there will never be more 
than one protocol execution involving each proto- 
col object at a given time. For example, we cannot 
make sure that there will not be two concurrent mul- 
ticast communications and/or transactions involv- 
ing the same protocol objects. Allowing concur- 
rency at this level is an essential feature. Moreover, 
as far as protocol composition is concerned, single
inheritance is inadequate for offering a satisfactory 
degree of flexibility. 

For all these reasons, we made BAST evolve 
through a second implementation of which the main
goal was to overcome the limitations mentioned 
above. We now discuss some design alternatives 
that were considered in the process of implement- 
ing this second prototype of BAST, together with 
design issues that we studied from other existing 
frameworks described in the literature. 


Multiple Inheritance and Mixins 


Although our prototyping language does not offer 
multiple inheritance, assembling the various pro- 
tocol layers through this code reuse mechanism is 
very appealing⁶. The idea is to make each protocol
class πObject implement only protocol π,
while accessing all required underlying protocols 
through unimplemented operations; each protocol 
class is then an abstract class and we usually say 
it is a mixin class or simply a mixin. Before being 
able to actually instantiate a protocol object, one 
first has to build a new class deriving from all the 
necessary mixins. 
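
The mixin alternative can be sketched as follows, using a language that does offer multiple inheritance. The sketch is illustrative only: the mixin class names, the MyObject class and its inbox are assumptions, and rSend() is a direct call with no reliability.

```python
class RMPMixin:
    """Implements only point-to-point sending."""

    def rSend(self, payload, dest):
        dest.rDeliver(payload)


class RMCMixin:
    """Implements only multicast; rSend() is left unimplemented here."""

    def rmcast(self, m, dest_set):
        for dest in dest_set:
            self.rSend(m, dest)   # required operation, supplied by another mixin


# Before a protocol object can be instantiated, a concrete class deriving
# from all the necessary mixins has to be built by the programmer.
class MyObject(RMCMixin, RMPMixin):
    def __init__(self):
        self.inbox = []

    def rDeliver(self, payload):
        self.inbox.append(payload)
```

A class deriving from RMCMixin alone can still be instantiated here, but calling rmcast() on it fails because nothing supplies rSend(); resolving such protocol relationships is left to the programmer.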

There are three major drawbacks with this approach.
First, protocol classes are no longer ready-
to-use components: a fairly complex multiple subclassing
phase is now required. As a consequence,
programmers have to deal with protocol relationships
“manually”. Second, protocol layers can only
be assembled through subclassing, and it is thus
difficult if not impossible to compose protocols at
runtime: in several programming languages, e.g.,
C++, classes are only compile-time entities. Third, 
we still have to manage concurrent protocol exe- 
cutions within the same protocol object, while this 
problem is handled nicely as soon as algorithms are 
manipulated as objects. 


Toolbox Approach 


Another possible approach to the reuse of protocol 
implementations is to provide programmers with a 
toolbox containing reusable components and associate
them with design patterns. Both the ASX [21] and
CONDUITS+ [13] frameworks can be seen as such
toolboxes. The ASX framework provides collaborating
C++ components, also known as wrappers,
that help in producing reusable communication
infrastructures. These components are designed
to perform common communication-related tasks,
e.g., event demultiplexing, event handler dispatching,
connection establishment, routing, etc. Several
design patterns, such as the Reactor pattern and
the Acceptor pattern, act as architectural blueprints
that guide programmers in producing reusable and
portable code. In CONDUITS+ [13], two kinds of
objects are basically offered: conduits and information
chunks, which can be assembled in order to
create protocol layers and protocol stacks. Various
patterns are also provided to help programmers in
building protocols.


⁶Ingalls and Borning have shown how reflective facilities of
Smalltalk can be applied to extend the language with multiple
inheritance [14], so we could have used that technique if we
really wanted to.

However, there is no such thing as a protocol object
in either of the above frameworks. Since our
main intent is to provide programmers with a pow- 
erful unifying concept, the protocol object, we did 
not choose a toolbox approach for BAST. Further- 
more, ASX does not promote protocol composition, 
whereas CONDUITS+ does it in a slightly different 
way than BAST, as we discuss below. 


Black-box Framework 


CONDUITS+ offers basic elements that help programmers
build protocol layers. The use of design
patterns is motivated by the fact that traditional layered
architectures do not allow code reuse across
layers, which is precisely what CONDUITS+ aims at.
Protocols can then be composed with CONDUITS+,
at a lower level than with BAST, through the assembling
of conduits and information chunks, which are elementary
blocks used to build protocol layers. In
other words, the CONDUITS+ framework does not
allow the manipulation of protocol layers as objects,
but only the manipulation of pieces of protocol
layers. Compared to BAST, protocol algorithms
are further decomposed in CONDUITS+: conduits
and information chunks are finer grain objects than
BAST’s strategies. Indeed, strategies represent protocol
layers, while conduits and information chunks
are internal components of protocol layers. CONDUITS+
goes one step further in the process of objectifying
protocol algorithms.

This approach makes it easy for CONDUITS+ to
be a pure black-box framework, while BAST combines
features of both black-box and white-box
frameworks⁷. With BAST, we are considering completely
getting rid of inheritance but this issue has
to be carefully studied, because it would have important
consequences on the way BAST can be used
by application programmers, i.e., those who have
very limited skills in fault-tolerant distributed algorithms.


⁷In a black-box framework, reusability is mainly achieved
by assembling instances, whereas in a white-box framework,
it is mainly achieved through inheritance. A black-box framework
is easier to use, but harder to design.


Modeling Communications 


Several systems model communications but do not
really address reliability issues, e.g., STREAMS [18]
and the x-Kernel [17]. AVOCA [24] defines the notion
of protocol objects, but not in the sense that BAST
does; furthermore, it mainly applies to
high-performance communication subsystems. Other
systems offer reliable distributed communications,
either based on groups as elemental addressing
facilities, e.g., CONSUL [15], ISIS [1] and Horus [23],
or based on transactions, e.g., ARJUNA [22].


Microprotocols and the x-Kernel 


The work done by O’Malley and Peterson [16] is the
closest to BAST that we could find. They extended the
x-Kernel with the notion of microprotocol graph, and
they described a methodology for organizing network
software into a complex graph, where each
microprotocol encapsulates a single function. In
contrast, conventional ISO and TCP/IP protocol stacks
have much simpler protocol graphs, with each layer
encapsulating several related protocol functions.
They argue that such a fine-grain decomposition
allows for better tailoring of communication
protocols to application needs; our conclusion
concurs with theirs perfectly on that point. In their
paper, O’Malley and Peterson mainly apply their
approach to RPC communications (with only one very
short discussion of what they call a fault-tolerant
multicast). Compared to BAST, their approach is very
close to what we have done and is based on the same
basic assumption: composing (micro-)protocols is
essential when it comes to customizing complex
distributed applications (and fault-tolerance implies
such complexity). In their terminology, what we call
problem 7 is referred to as metaprotocol 7.

There are also some important differences, however.
They do not provide ready-to-use protocol classes to
application programmers who are not skilled at
understanding and/or building complex protocol
graphs, whereas this is one of the main goals of
BAST [7]. Moreover, their approach does not rely on
design patterns. Similarly to CONDUITS+, they go one
step further in their decomposition of protocol
algorithms, by defining the notion of virtual
protocols. The latter “are not truly protocols in the
traditional sense” [16, page 131]: virtual protocols
are actually used to remove IF-statements and to
place them in the microprotocol graph instead. All
those differences can be best understood by looking
at the background domains of the BAST library and the
x-Kernel respectively. The latter aims at helping
system programmers to customize any communication
protocol usually found in modern operating systems,
while the former aims at providing ready-to-use
protocol classes, in order to help any programmer to
build fault-tolerant applications, and at allowing
skilled programmers to build new fault-tolerant
protocols easily.


Composing Protocol Stacks in HORUS 


As far as protocol composition is concerned, the
HORUS system enables the building of protocol stacks
from existing layers only in a strictly vertical
manner. Furthermore, it is based on groups as the
fundamental addressing and communication facility,
and provides no framework and/or pattern for building
new protocol layers. HORUS merely provides a finite
set of ready-to-use protocol layers, which can only
be composed around the group membership protocol.


With BAST, we have tried to model any kind 
of interaction between distributed objects, not only 
group communications. This is essential in order to 
deal with failures in an extensible way, because re- 
liable protocols tend to be much more complex than 
normal communications. By making protocol ob- 
jects BAST’s basic distributed entities, we can build 
both the group model and the transaction model [6]. 
Furthermore, the Strategy pattern provides a pow- 
erful scheme for creating new protocols through 
composition. 
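As a rough illustration of composition through the Strategy pattern, the Java sketch below lets a protocol object delegate to an interchangeable strategy and builds a new strategy out of an existing one. All names (ProtocolObject, SendStrategy, BestEffortSend, RepeatingSend) are hypothetical and chosen for this example only; they are not BAST's actual classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical strategy interface: one algorithm for emitting a message.
// The returned list stands in for what would be put on the wire.
interface SendStrategy {
    List<String> send(String message);
}

// Simplest strategy: emit the message exactly once.
class BestEffortSend implements SendStrategy {
    public List<String> send(String message) {
        List<String> wire = new ArrayList<>();
        wire.add(message);
        return wire;
    }
}

// A strategy built by composition: it reuses another strategy
// and repeats its transmissions a fixed number of times.
class RepeatingSend implements SendStrategy {
    private final SendStrategy inner;
    private final int copies;

    RepeatingSend(SendStrategy inner, int copies) {
        this.inner = inner;
        this.copies = copies;
    }

    public List<String> send(String message) {
        List<String> wire = new ArrayList<>();
        for (int i = 0; i < copies; i++) {
            wire.addAll(inner.send(message));
        }
        return wire;
    }
}

// The protocol object (the "context" of the pattern): it holds a
// strategy and can swap it at run time without subclassing.
class ProtocolObject {
    private SendStrategy strategy;

    ProtocolObject(SendStrategy strategy) {
        this.strategy = strategy;
    }

    void setStrategy(SendStrategy strategy) {
        this.strategy = strategy;
    }

    List<String> send(String message) {
        return strategy.send(message);
    }
}
```

The point of the sketch is that RepeatingSend needs no knowledge of the inheritance hierarchy of BestEffortSend; it only relies on the SendStrategy interface.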


Conference on Object-Oriented Technologies and Systems - June 16-20, 1997 USENIX Association 




5 Concluding Remarks 


In this paper, we presented how protocol objects 
can help in building reliable distributed systems. 
We focused on how the Strategy pattern allows the 
limitations of inheritance to be overcome, when 
trying to compose protocols. As far as we know, 
BAST is the only environment to provide both a 
set of ready-to-use protocol objects for building 
fault-tolerant distributed applications, and a com- 
plete framework based on design patterns, for com- 
posing new protocols from existing ones. We see it 
as our contribution to the design of well-structured 
reliable distributed systems. 

Our current prototype of BAST is fully opera- 
tional and is available for Smalltalk and Java. At the 
moment, inheritance is still partly involved when 
composing distributed protocols; although a minor 
drawback, this does not make protocol composi- 
tion as flexible as one might expect. This is due to 
the fact that programmers have to know something 
about the implementation of the protocol classes 
they reuse, namely their inheritance relationships. 
This is not surprising, since inheritance is known 
to violate encapsulation and to hinder modularity. 
Future work will consist of trying to decide if get- 
ting rid of inheritance, at least as far as protocol 
composition goes, is a good way to achieve even 
more flexibility. We are also extending BAST with 
new protocol classes, supporting frequently used 
protocols in reliable distributed systems, and
optimizing existing protocol classes to improve
performance. Further information about BAST can be
found at http://lsewww.epfl.ch/bast; our public, free
implementation is also available there.


Abstract


We propose the peer object-group design pattern as a 
suitable architectural solution to structure and im- 
plement synchronous groupware applications. We 
discuss a reliable group-communication subsystem 
and a distributed objects model, implemented in 
Java, used to realize the approach. 


1. Characterizing Groupware


Computer Supported Cooperative Work (CSCW) deals with
the use of computer systems by people to work
cooperatively on common tasks. Groupware is software
specially built to allow people to work cooperatively.


Groupware and user interaction can be roughly clas- 
sified in two broad classes - asynchronous or differ- 
ent-time, and synchronous or same-time. When using 
asynchronous groupware, users work not necessarily 
in the same time-frame and interact for long periods 
of time (e.g. in the joint development of a software 
project). When using synchronous groupware users 
work in a tightly-coupled manner during relatively 
short common time-frames (e.g. during a distributed 
meeting). The synchronous and asynchronous coop- 
eration paradigms are not alternatives, but rather 
complementary; real work is most often performed 
alternating asynchronous work with synchronized 
periods. 


We are in the process of investigating generic sys- 
tem-level services to support and allow simple de- 
velopment of robust groupware applications. In this 
text we will focus on system support for synchronous 
groupware. In particular, we will discuss structuring 
and programming abstractions based on group- 
communication and object-groups specially devised 
to help in the development of synchronous group- 
ware applications (SGA). 


A feature common to virtually all SGA is the
provision of a shared workspace which users use to
communicate and cooperate during a synchronous
session. For acceptable productivity, users need to
have an accurate notion of what the state of the
shared workspace is. In particular, users should have
mutually consistent views of the state of the
workspace and should see each other's actions as soon
as possible. SGA are also interactive applications by
nature, so it is required that the system responds
and evolves according to users' expectations [1].
Users desire short or immediate response times,
preferably similar to those found in single-user
applications. Users do not find it acceptable to wait
a considerable amount of time to perform some
operation (e.g. to update a shared object). Because
users tend to divide/phase tasks into smaller
sub-tasks and user communication and cooperation have
multiple facets, SGA should be seen not as monolithic
applications but rather as a collection of tools
aggregated in the context of a single session (e.g.
including a tool for shared drawing, a tool for
message exchanging, a tool for text or document
editing, tools providing audio and video channels,
user activity awareness, coordination, etc.). From a
software-engineering perspective, it is also
preferable to use a generic multi-tool approach
rather than to provide all the functionality from
scratch in every application.


2. Design Alternatives to Support Dis- 
tributed Synchronous Groupware 


A commonly used architectural approach to support
distributed SGA is the client-server paradigm. A
central server is used to manage the shared
workspace, to perform concurrency control on user
accesses, and to provide other session-related
services (e.g. user activity awareness). User
processes use the server to operate on shared
resources and to disseminate information to other
users. While this is a very well understood paradigm,
and it is simple to realize, it presents major
drawbacks in fault-tolerance and scalability, since
it is based on a central server. Moreover,
performance can suffer under this architectural
approach, although clients may replicate/cache parts
of the shared workspace in order to mitigate the
problem. A variation is the centralized
application-distributed interface approach, where a
single application multiplexes user interaction and
disseminates output across several user interfaces.
It presents the same problems as the client-server
approach, and is in general less flexible.


An alternative is the replicated-server, or
object-group, approach. A group of servers actively
replicates objects and/or service state. Even if a
subset of servers crashes or becomes unreachable, the
service will be available as long as some of them
remain reachable (one or the majority - depending on
consistency criteria). This approach is very suitable
for many distributed fault-tolerant services, but
still presents some drawbacks in the context of SGA.
Because users want to have accurate views of the
shared workspace, and because users' actions are
largely driven by other users' actions, extra
mechanisms for event notifications are required.


Preliminary experience with scalable, fault-tolerant
distributed systems has suggested that migrating
complex system functionality from servers to clients
may be a suitable design option. This argument, the
need for flexibility in tool building, and the
low-latency requirements of SGA suggest, in our view,
a much more natural approach - the peer object-group
approach.


3. The Peer Object-Group Design Pattern


In the peer object-group approach the shared
workspace managed by SGA is materialized as a
collection of objects replicated amongst users' local
environments. Each local environment holds a replica
for every object the associated user is currently
accessing or working on. The set of replicas for a
given object constitutes a (peer) object-group.
Consistency amongst the replicas is kept by a
group-communication subsystem implementing
appropriate consistency criteria. Shared objects are
mapped to object-groups and operations on the objects
are mapped to (reliable) multicast operations.
Figure 1 schematically illustrates the model.


Users gain access to objects by dynamically joining
the corresponding object-groups - which may involve
the transparent transfer of the object's current
state to the local replica. When no longer interested
in the objects, users leave the object-groups.
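The join, state-transfer and multicast-update flow described above can be sketched as follows. This is a minimal single-process illustration, not the authors' implementation: the class names (CounterReplica, ObjectGroup) are hypothetical, and the loop in multicast stands in for a reliable multicast over the network.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical replicated object: each user holds one local replica.
class CounterReplica {
    int value;                       // the replicated state

    void apply(int delta) {          // an operation on the shared object
        value += delta;
    }
}

// Hypothetical in-process stand-in for one (peer) object-group.
class ObjectGroup {
    private final List<CounterReplica> members = new ArrayList<>();

    // Joining creates a local replica and transfers the current
    // state from an existing member, if there is one.
    CounterReplica join() {
        CounterReplica replica = new CounterReplica();
        if (!members.isEmpty()) {
            replica.value = members.get(0).value;
        }
        members.add(replica);
        return replica;
    }

    // A user who is no longer interested leaves the group.
    void leave(CounterReplica replica) {
        members.remove(replica);
    }

    // An operation on the shared object is mapped to a multicast:
    // every current member applies the same update.
    void multicast(int delta) {
        for (CounterReplica replica : members) {
            replica.apply(delta);
        }
    }

    int size() {
        return members.size();
    }
}
```

A replica that has left the group simply stops receiving updates; its last known state stays in the local environment.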


Users keep accurate views of the shared state since
updates are received by all object-group members; no
provision for additional notification mechanisms is
required. Latency in object manipulation is improved
because no intermediate entities are present.
Fault-tolerance and availability are also improved; a
K-degree of fault-tolerance is achieved as long as
K+1 members keep copies of shared objects.

Figure 1 - The peer object-group design pattern.


Different shared objects may have different
replication consistency requirements, meaning that
the underlying group-communication subsystem should
provide group-protocols with different service
semantics. The selection of object-group granularity
must inevitably be tool and protocol driven; the
lighter-weight the protocols are, the finer the
granularity can be.


Object persistence is not addressed by this bare
model. While it is desirable that some objects
outlive sessions, that facility is provided by the
asynchronous groupware support, and will not be
discussed in this paper.


4. Object-Oriented Group-Protocol Implementation and Composition


The essential component to realize the peer
object-group approach is the group-communication
subsystem, which provides group membership and
message passing services. Those services are
implemented by group-protocols and accessed through
the combined use of a user-to-protocol service
request interface and a protocol-to-user event
notification interface. The former includes methods
for message sending and multicasting as well as group
management. The latter includes methods for message
delivery and group membership view change
notifications.
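The two interfaces described above might look roughly like the following Java sketch. The interface and method names (GroupService, GroupListener, join, multicast, deliver, viewChange) are assumptions made for illustration, not the framework's real API, and point-to-point sending is left out for brevity. LoopbackGroup is a trivial single-process stand-in for a real protocol stack.

```java
import java.util.ArrayList;
import java.util.List;

// User-to-protocol service request interface (hypothetical names).
interface GroupService {
    void join(String member);
    void leave(String member);
    void multicast(String sender, String message);
}

// Protocol-to-user event notification interface (hypothetical names).
interface GroupListener {
    void deliver(String sender, String message);
    void viewChange(List<String> members);
}

// Trivial loopback implementation: it keeps the membership view
// locally and notifies a single listener about every event.
class LoopbackGroup implements GroupService {
    private final List<String> view = new ArrayList<>();
    private final GroupListener listener;

    LoopbackGroup(GroupListener listener) {
        this.listener = listener;
    }

    public void join(String member) {
        view.add(member);
        listener.viewChange(new ArrayList<>(view));
    }

    public void leave(String member) {
        view.remove(member);
        listener.viewChange(new ArrayList<>(view));
    }

    public void multicast(String sender, String message) {
        listener.deliver(sender, message);
    }
}

// A listener that records the upcalls it receives.
class RecordingListener implements GroupListener {
    final List<String> log = new ArrayList<>();

    public void deliver(String sender, String message) {
        log.add("deliver " + sender + ": " + message);
    }

    public void viewChange(List<String> members) {
        log.add("view " + members);
    }
}
```

Separating downcalls (requests) from upcalls (notifications) in this way lets an application react to deliveries and view changes without polling the protocol.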


Implementing group-protocols is a complex task,
mainly because they must convert an unfriendly system
environment into a friendly one. A good engineering
option is to use a modular approach in designing and
implementing group protocols. Multiple
(micro-)protocol layers, each implementing some
specific service, are stacked to build complex
protocol services [2].


To realize the peer object-group we have imple- 
mented an object-oriented framework to allow the 
convenient implementation and composition of 
group-protocols. Because we want to maximize 
flexibility, allow application and system components 
to be loaded on-demand, and support heterogeneity, 
the Java language was a natural choice [3]. The inte- 
gration with the Web was an additional motivation. 


In our framework protocol layers are implemented as
objects of special classes, which implement
programming interfaces related to group services.
Complete protocol structures (or stacks) are built by
attaching protocol objects together. To allow simple
construction of protocol structures, protocol
structure description strings and generators are
used. Description strings convey information about
which protocols should be used to build a particular
protocol structure and the topological relationships
between the layers. Protocol structure generators
parse strings and generate the corresponding protocol
structures, by dynamically loading the layer classes
and creating the layer objects.
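A generator of this kind could be sketched as below. The description-string syntax is not given in the text, so the colon-separated, top-to-bottom format ("Membership:Fifo:Unreliable") is purely an assumption, as are the class names; only a linear stack is handled, not a general topology. Layer classes are loaded by name with standard Java reflection.

```java
import java.util.ArrayList;
import java.util.List;

class StackGenerator {
    // Hypothetical base class: each layer knows the layer below it.
    public static class Layer {
        Layer below;
        public Layer() {}
    }

    // Three placeholder layers, loaded by name at run time.
    public static class Membership extends Layer { public Membership() {} }
    public static class Fifo extends Layer { public Fifo() {} }
    public static class Unreliable extends Layer { public Unreliable() {} }

    // Parses an assumed "Top:Middle:Bottom" description string,
    // loads each layer class dynamically and links the layers.
    static Layer generate(String description) {
        Layer top = null;
        Layer previous = null;
        for (String name : description.split(":")) {
            Layer layer;
            try {
                // Nested classes have the binary name Outer$Inner.
                Class<?> cls = Class.forName(
                        StackGenerator.class.getName() + "$" + name.trim());
                layer = (Layer) cls.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException | ClassCastException e) {
                throw new IllegalArgumentException("unknown layer: " + name, e);
            }
            if (previous == null) {
                top = layer;
            } else {
                previous.below = layer;
            }
            previous = layer;
        }
        return top;
    }

    // Lists the simple class names of a stack from top to bottom.
    static List<String> names(Layer top) {
        List<String> out = new ArrayList<>();
        for (Layer l = top; l != null; l = l.below) {
            out.add(l.getClass().getSimpleName());
        }
        return out;
    }
}
```

Because layers are resolved by name, a new layer class can be added without touching the generator, which is the property the dynamic-loading design is after.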


In implementing specific group-protocols we have
considered SGA specifics. Because users' object
working sets are expected to change often during the
lifetime of a session and users should be able to
enter and leave sessions dynamically, dynamic
lightweight group membership services were used. In
particular, we have specified a new membership and
reliable multicast service semantics - linear
convergent synchrony, which is weaker than the
"standard" view synchrony [4], but can be implemented
by protocols which incur less overhead for group
membership management. The semantics and implemented
protocol are linear, in the sense that no view
merging is allowed, because we assume that state
reconciliation due to network partitions is performed
using the external data storage services. The
protocol uses a specially tailored FIFO reliable
multicast protocol.


We have also experimented with optimistic ordering
techniques to reduce system response time. In
particular, the Undo/Redo delivery paradigm was used
to reduce update latency [5]. In this paradigm
messages are delivered locally while being
asynchronously multicast to the group. If ordering
conflicts arise, some previously delivered
messages/updates are undone. The semantics of object
operations (e.g. the commutative property) are
exploited to reduce the probability of conflicting
updates. The protocol sits on top of a
sequencer-based total ordering and state transfer
protocol, which achieves high levels of concurrency
even during process joins.


5. Object-Groups Management and Session Services


In addition to a group-communication subsystem 
SGA programming can benefit from the provision of 
other more specific services. Mechanisms and serv- 
ices are required for the naming and binding to ses- 
sions, for the management of the object-groups in the 
shared workspace, and to enable user activity aware- 
ness. 


We have defined and implemented an extensible
distributed object model to structure and implement
SGA, which tackles the above issues in an integrated
manner. In addition to the collection of peer
object-groups which constitutes the shared workspace,
we have introduced the notion of a fully replicated
Session object. A Session object is supported by a
special bootstrap object-group, which all user
processes must join to enter a session. Binding
information required to enter a session is fetched
from an external binding service.


A Session object's main purpose is to store and
manage directories which hold information about
created object-groups and users participating in the
session. Object-group information includes binding
and management data (e.g. protocol structure
descriptions and replication options). User
information includes human-readable data about human
users (e.g. user full name, user photograph, e-mail
address, Web home-page, etc.). Both object-groups and
users are identified by names, and are represented as
Java classes which can be derived by applications to
convey additional information. Conceptually, we
abstract an application as a collection of shared
object(-groups) and users organized around the fully
replicated Session object. Figure 2 depicts an
intuitive view of the distributed objects model.


An application programmer can invoke the methods of a
Session object to create, destroy, join or leave
object-groups and to obtain information about users.
A reactive programming style can also be used to act
on session-related events (e.g. a user entering or
leaving a session, or an object-group being created
or destroyed).

Figure 2 - Objects conceptual model.


6. Experience and Future Work 


As described, we have developed up to this point the
framework for protocol composition, a set of
stackable group-protocols and the distributed objects
model. We have tested the suitability of our ideas by
implementing a demo white-board tool. It is a simple
tool which manages a shared drawing canvas and
requires only one object-group to be implemented. It
was tested with only a small number of users in a
local network. In this restricted setting, system
response has proved to be quite acceptable, i.e.
system performance did not suffer significant
degradation when operating on replicated shared
objects. Additional experience and performance
measures are required to analyze system behavior in
more general environments.


Many potential work directions were revealed during
the course of our work. We plan to continue the
process of specifying suitable group-communication
semantics and implementing new protocols. In
particular, we expect to develop layers for
lightweight groups - multiplexing many groups into a
small number of groups, in order to increase system
performance when many fine-granularity object-groups
are used. This may call for the definition of
multiple-group service semantics. The issue of
failure-detector consistency will also be addressed,
i.e. ensuring that the membership layers of several
protocol stacks have similar views of what the
connectivity state is. Also, we expect to tackle the
always important issue of security and access
control.


We also plan to develop additional tools and
applications, to help validate more clearly the
usefulness of the abstractions discussed in this
paper. We will consider enhancing our object model
with additional structuring abstractions and common
services as more experience is gained. Finally, we
intend to build a stub-compiler to simplify the task
of shared objects programming.


1 Overview


Several application domains such as finance, pro- 
cess control, and telecommunications, have strong 
reliability requirements. Typically, such applica- 
tions tend to avoid having a single point of failure, 
and need to communicate with reliable primitives 
that prevent message loss and ensure atomicity gu- 
arantees. Among such applications, we have focu- 
sed on reliable notification-based applications, such 
as trading systems and news agencies, where produ- 
cers need to reliably deliver information to a set of 
consumers. Developing such applications is grea- 
tly eased with a middleware providing reliable bro- 
adcast semantics [7]. 


Group-oriented systems like Isis [3], Horus [9], 
Totem [2] or Transis [1], provide reliable broadcast 
primitives and are generally considered to be good 
candidates for implementing reliable notification- 
based applications. Nevertheless, these systems are 
proprietary solutions with limited portability and 
interoperability. Although efforts have been made 
recently to achieve better modularity (e.g., in Ho- 
rus), the infrastructure of group-oriented middle- 
wares usually consists of several layers that are not 
necessarily required at upper levels and usually turn 
out to be performance penalizing. 


Orbix+Isis [8] is an effort at supporting replication
of CORBA objects transparently, by integrating Isis
in Orbix. This approach requires a modification of
the ORB and leads to a non-standard and
non-interoperable solution. The Object Group
Service [6] provides replication of CORBA objects
without using heavy-weight group communication
toolkits (e.g. Isis) and would provide the degree of
reliability required by our application class. The
tradeoff is performance degradation since it
introduces replicated intermediary objects.

We present here a way to augment CORBA with 
a reliable broadcast facility. Our approach is
pragmatic in the sense that it requires no modification
of the Object Request Broker, and we do not build
a new CORBA service from scratch. Instead,
we add reliability features to the existing CORBA 
Event Service, which already provides multicast- 
like communication. The extension we introduce 
requires no modification of the CORBA specifica- 
tion, and can be applied to any standard Event Ser- 
vice implementation, without any communication 
overhead. The resulting service, called Reliable 
Event Service, adequately fits the required seman- 
tics of reliable notification-based applications. It 
constitutes an interesting light-weight and open al- 
ternative to existing group-oriented systems. 


2 Reliability Issues 


We consider notification-based applications where
communication is decoupled between consumers and
suppliers of information, with specific reliability
requirements. This type of application is widespread
in domains like process control, finance, or
telecommunications.

The use of CORBA for such applications has many
advantages over other approaches. The portability and
interoperability aspects of CORBA are strong assets.
The paradigm offered by the event channels is well
adapted to notification-based applications since it
provides a flexible model for asynchronous
communication among distributed objects. Furthermore,
relying on one-to-one communication to implement this
functionality would require some amount of
bookkeeping to keep track of the consumers. Finally,
depending on the implementation, there is a potential
for the Event Service to be scalable, while this is
clearly not the case with one-to-one communication
primitives.


2.1 Limitations of the Event Service 


Since the Event Service is based on a centralized
architecture, where a channel is just another CORBA
object, it introduces a single point of failure.
Furthermore, the CORBA specification is vague
concerning the quality of service provided by event
channels. It states that the Event Service does not
need to provide stronger semantics than “best-effort”
delivery of the events, although implementors of the
Event Service are advised to provide various semantic
levels for their channels. The problem of the
centralized architecture of the event channels may be
addressed in two ways:


• Replicate the event channels. Event channels are
replicated, and hence, are no longer a single point
of failure. This approach, used in Isis News [3],
requires the use of specific protocols, like group
communication [4], to keep replicated objects
consistent.


• Decentralized architecture. In a decentralized
architecture an event channel is not implemented as a
single physical object but rather as a collection of
collaborating objects. This approach makes it
possible to build a protocol based on IP-multicast
rather than point-to-point communication, thus
improving efficiency and scalability.


A solution to the lack of clearly specified semantics
requires the definition of a standard quality of
service to be expected from any implementation of the
Event Service and a standard way to select it. The
specification may define different levels of quality
of service, from which the application programmer may
choose. Currently, a valid implementation of the
Event Service needs to be at least “best-effort”. In
other words, it puts no actual requirement on the
delivery semantics since “best-effort” is a
subjective description rather than a real property.
Since a vague and minimal description is not suitable
for reliable applications, we describe a protocol in
Section 3 that extends any Event Service to make it
reliable.


3 A Reliable Event Service 


We introduce here the Reliable Event Service that
provides reliable event channels, by extending the
quality of service of any existing (unreliable) Event
Service. The approach we adopted provides the exact
quality of service required by the application class
considered, and focuses on providing good
performance. Furthermore, it is orthogonal to the
architecture (centralized/decentralized/replicated)
of the Event Service that it extends.

The semantics we associate with the Reliable 
Event Service are close to those of a Reliable Mul- 
ticast primitive [7]. This primitive ensures that dif- 
ferent clients receive the same set of messages. An 
informal definition of this primitive could be the 
following: if a correct object multicasts a message 
m, then all correct objects eventually deliver m. 
Furthermore, if a correct object delivers a message 
m, then m was previously multicast by some object 
and all other correct objects will eventually deliver 
m. Briefly, Reliable Multicast has two properties: 
at-most-once and atomicity (all-or-nothing). 

Ideally, we would use a reliable multicast primitive,
but its strong properties have a very high cost in
terms of communication overhead. In a typical
implementation of this primitive, the number of
messages generated is in O(n²) and it requires that
each consumer keeps a list of all the other clients.
In the context of a diffusion network (e.g.,
Ethernet) where the complexity of a multicast is
O(1), the complexity of the reliable multicast is
still O(n). Since the cost increases proportionally
with the number of destinations, it is not scalable.

Our mechanism has weaker properties than a Reliable
Multicast, but it suits our requirements for
reliability and does not change the complexity of the
underlying communication. It is split into three
parts. The first part consists of detecting when a
message has been lost, in order to emit a
notification. The second part helps to reduce the
probability of actual loss by retrying unsuccessful
transmissions. Finally, the last part ensures that
messages are delivered in a FIFO manner.


3.1 Notification of Message Loss 


Since there is no time bound on the delivery of
messages, it is not possible to distinguish a lost
message from a slow one. Hence, we consider the
message to be lost in both cases.

To detect when a message is lost by the channel, we
add some extra information to each message: a unique
message identifier. Each producer has a unique
identity given by its CORBA object reference. This
tag makes messages issued by two different producers
distinguishable. In order to differentiate messages
issued by the same producer, we add a second field
holding a local identifier (id). This id consists of
a sequence number that is incremented each time a new
message is sent. Therefore, clients will eventually
detect lost messages based on missing sequence
numbers. If the underlying event channels are not
FIFO, the client may assume that a message is lost
while it is only delayed. In that case, the client
will launch the replay protocol (see below), and
discard duplicate messages.
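The gap detection described above can be sketched in Java as follows. This is an illustrative reading of the text rather than the authors' code: MessageId and LossDetector are hypothetical names, a plain string stands in for the producer's CORBA object reference, and sequence numbers are assumed to start at 0.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Identifier carried by each message: producer identity plus a
// per-producer sequence number. A String stands in for the
// CORBA object reference of the producer (an assumption).
class MessageId {
    final String producer;
    final long seq;

    MessageId(String producer, long seq) {
        this.producer = producer;
        this.seq = seq;
    }
}

class LossDetector {
    // Next sequence number expected from each producer
    // (numbering is assumed to start at 0).
    private final Map<String, Long> expected = new HashMap<>();

    // Called when a message arrives. Returns the identifiers that
    // are presumed lost because their sequence numbers were skipped.
    // A duplicate or late copy yields an empty list and leaves the
    // expected counter untouched.
    List<MessageId> onReceive(MessageId id) {
        long next = expected.getOrDefault(id.producer, 0L);
        List<MessageId> missing = new ArrayList<>();
        if (id.seq < next) {
            return missing;          // already seen or already reported
        }
        for (long s = next; s < id.seq; s++) {
            missing.add(new MessageId(id.producer, s));
        }
        expected.put(id.producer, id.seq + 1);
        return missing;
    }
}
```

Note that a gap only becomes visible when a later message from the same producer arrives, which matches the "eventually detect" wording of the text.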


3.2 Message Replay 


When a client detects the loss of a message, it
contacts the producer by using the CORBA reference
embedded in the message identifier. The client issues
a request for the lost message using a synchronous
remote method invocation and waits for a reply (see
Figure 1). If the producer has not crashed in the
meantime, the message will be resent and the client
may continue. If a problem occurs (e.g., the producer
has crashed) the reply is an exception and the client
is supposed to react adequately. This approach is
actually based on the principle of negative
acknowledgments. In order to be able to resend a
message, the producer needs to keep a buffer with
every message it sends.

Figure 1: Replay of a lost message.


When the loss of a message is not recoverable, 
the most sensible approach consists in issuing an 
exception to the handled by the application. Since 
the adequate reaction to such a loss varies from one 
application to another, we leave the responsibility 
of reacting properly to the application programmer. 
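
The replay exchange can be sketched as follows, with a local method call standing in for the synchronous remote invocation. The class `Producer`, the exception `MessageUnavailable`, and the helper `recover` are hypothetical names introduced for illustration; the sketch also assumes an unbounded buffer, which a real producer would have to trim.

```python
class MessageUnavailable(Exception):
    """Raised when a lost message can no longer be replayed."""


class Producer:
    def __init__(self):
        self._buffer = {}   # seq -> payload, kept so messages can be resent
        self._next_seq = 0
        self.crashed = False

    def send(self, payload):
        seq = self._next_seq
        self._buffer[seq] = payload
        self._next_seq += 1
        return seq

    def replay(self, seq):
        """Stand-in for the synchronous remote invocation issued by a client."""
        if self.crashed or seq not in self._buffer:
            raise MessageUnavailable(seq)
        return self._buffer[seq]


def recover(producer, missing):
    """Negative acknowledgment: ask the producer only for what is missing."""
    return {seq: producer.replay(seq) for seq in missing}
```

Because only missing messages are requested, a loss-free run generates no acknowledgment traffic at all; the cost is the buffer the producer must maintain.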


3.3 Ensuring FIFO Ordering 


Since our protocol is aimed at working with any 
implementation of the event channels, we face an 
additional problem. If the underlying protocol en- 
sures that received events are delivered in the same 
order as they were sent (FIFO property), replay- 
ing lost messages breaks this property. Hence, to 
avoid this problem, we add a mechanism that guar- 
antees FIFO delivery of events. This mecha- 
nism, illustrated in Figure 1, is an adaptation of the 
FIFO multicast presented in [7]. 

We first need to distinguish the reception of a 
message from its delivery. We call receive(m) the 
reception of the message m by the lower protocol 
layer, and deliver(m) the delivery of the message 
m from the lower layer to the upper layer. In some 
situations (e.g. upon a message loss) a message m’, 
sent after m, may arrive before m. In other words, 
receive(m') precedes receive(m). 
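
One common way to separate reception from delivery is a hold-back buffer: a message received ahead of its predecessors is kept aside and handed to the upper layer only once every earlier message has been delivered. The sketch below illustrates this for a single producer; the class name `FifoDelivery` and the assumption that sequence numbers start at 0 are ours, and this is not the exact algorithm of [7].

```python
class FifoDelivery:
    """Delivers messages from one producer to the upper layer in send order."""

    def __init__(self, deliver):
        self._deliver = deliver  # callback into the upper layer
        self._next = 0           # sequence number that may be delivered next
        self._held = {}          # received but not yet deliverable messages

    def receive(self, seq, payload):
        if seq < self._next or seq in self._held:
            return  # duplicate, already received
        self._held[seq] = payload
        # Deliver the longest run of consecutive messages now available.
        while self._next in self._held:
            self._deliver(self._held.pop(self._next))
            self._next += 1
```

With this structure, a replayed message simply fills the gap in the buffer and releases everything that was waiting behind it.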

In order to ensure the FIFO property, the delivery 
of m’ is delayed until m has been received and deli- 
vered. This implies that the FIFO order of delivery 
is preserved for the upper layer. In other words, 
deliver(m) precedes deliver(m’). The FIFO pro- 
perty is thus guaranteed by our protocol, whether or 
not the underlying communication channel delivers 
the events in a FIFO order. 


3.4 Appropriate Reaction 


When a message has been lost and is no longer avail- 
able, the client has to react accordingly. The most 
appropriate reaction depends on the application. A 
non-exhaustive list of possible reactions to the loss 
of a message is: 


• Ignore (trivial case). The lost message is igno- 
red. There was no need for our protocol and 
reliability is not necessary. 


• Quit. The client is considered faulty, and 
hence, decides to commit suicide. 


• Quit & Recover. The client is considered 
crashed, but it subscribes again to the event 
channel, as if it were just starting to listen to 
the event channel. In the initialization phase, 
a producer may send initial information to the 
newcomer. 


• Warning. A warning message is issued to the 
end-user, telling that some information might 
not be up-to-date. 


To satisfy the needs of a large number of appli- 
cations, the most sensible approach consists in is- 
suing an exception whenever a message cannot be 
retransmitted. This leaves the responsibility of rea- 
cting properly to the application programmer. 

In order to guarantee the atomicity of delivery, it 
is necessary for the client not to be considered cor- 
rect when it fails to deliver a message. Therefore, 
the only reactions that guarantee atomicity are “Quit” 
and “Quit & Recover”. 
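
To make the choice of reaction concrete, the following sketch shows how an application-level handler might dispatch on a configured policy when the exception is raised. The enumeration `Reaction`, the helper names, and the callbacks `resubscribe` and `warn` are hypothetical; only the four reactions and the atomicity remark come from the list above.

```python
from enum import Enum


class Reaction(Enum):
    IGNORE = "ignore"
    QUIT = "quit"
    QUIT_AND_RECOVER = "quit & recover"
    WARNING = "warning"


# Reactions after which the client is no longer considered correct,
# which is what atomic delivery requires according to the text.
ATOMIC = {Reaction.QUIT, Reaction.QUIT_AND_RECOVER}


def preserves_atomicity(reaction):
    return reaction in ATOMIC


def on_unrecoverable_loss(reaction, resubscribe, warn):
    """Apply the chosen reaction; return True if the process keeps running."""
    if reaction is Reaction.IGNORE:
        return True
    if reaction is Reaction.WARNING:
        warn("some information might not be up to date")
        return True
    if reaction is Reaction.QUIT_AND_RECOVER:
        resubscribe()  # rejoin the event channel as a newcomer
        return True
    return False  # Reaction.QUIT: the client stops
```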


4 Implementation Issues 


When evaluating the relevance of using a middle- 
ware for the development of notification-based ap- 
plications, one of the main concerns is to rely on a 
standard definition rather than features specific to 
a particular vendor. Since these applications are 
expected to evolve over a long period of time, por- 
tability is a strong requirement. 

Our current implementation suffers from a num- 
ber of limitations inherent to the underlying Event 
Service that we use, i.e. IONA’s OrbixTalk. In par- 
ticular, it supports only the push model defined in 
the Event Service specification, and does not allow 
event channels to be chained (i.e., there must be at most 
one event channel between a consumer and a sup- 
plier). More information can be found in [5]. 


Abstract 


As more and more computers and workstations enter 
the workplace they are inevitably connected to a 
network. Networks provide the interconnection nec- 
essary for computers to share common data, periph- 
erals and other system resources. Distributed com- 
puting allows network applications to access func- 
tions or processes on remote computers. Developing 
network applications specifically to interact and draw 
upon resources of multiple computers creates the 
groundwork for a distributed system or distributed 
computing environment (DCE). 


An ideal distributed system is self monitoring and 
resilient to failures. In the event of a failure the sys- 
tem should dynamically reconfigure itself with auto- 
matic fail-over for applications that fall victim to the 
fault. Transparency of fault tolerant mechanisms is 
desirable, especially when introducing legacy appli- 
cations into the distributed system. The reduction of 
application development efforts heavily relies on the 
availability of portable, non-invasive, fault tolerance 
providing extensions, which introduce mechanisms 
for uninterruptible service by insertion into existing 
distributed applications. 


To test the potential for addressing some of these 
desired capabilities for a distributed system imple- 
mented within a CORBA distributed computing envi- 
ronment, the Interactive-Group Object-Replication 
(IGOR) system was developed. IGOR is a system of 
objects that provides fault tolerance through object 
replication by arranging replicas in fault tolerant 
groups which interact to provide access to redundant 
data and services. For purposes of portability, in- 
teroperability and to evaluate the CORBA environ- 
ment, IGOR was designed with the constraint of ly- 
ing entirely within the CORBA architecture and us- 
ing IIOP as the communication protocol. This 
guarantees its portability over changes in platform 
and network technologies. The IGOR system is re- 
configurable and its fault tolerance mechanisms are 
completely transparent to client applications. 


1. Introduction 


A plethora of computers and workstations enter the 
workplace each year and play an increasingly im- 
portant role as a digital tool used by people world- 
wide. With the increase of the use of computers 
comes the increase of stored information and multi- 
client services. However experience has shown that 
such information sources and services quickly be- 
come decentralized, then isolated and as a result, not 
interchangeable. Isolation is due to the disintegration 
of applications segregated by disparate hardware and 
operating systems, which lack the interoperability 
and robustness necessary for seamless information 
and service sharing. 


A distributed computing environment (DCE) implies 
an environment in which interoperation is not only 
possible, but is fundamentally inherent and natural. 
Distributed computing exploits the computational 
power of many computers, integrating the entire 
system into a single functional unit. Load balancing, 
parallel processing and distributed objects are major 
technologies that have evolved which aid in the im- 
plementation of full-fledged distributed systems. 


An ideal DCE provides the programmer with auto- 
mated tools that permit construction of distributed 
applications (clients and servers). Distributed appli- 
cations access functions on remote computers while 
giving the appearance of locally executed functions. 
Furthermore, the user need not be aware of the re- 
mote or local nature of the processes, nor the move- 
ment and conversion of the data involved. 


* This research was supported by Lockheed Martin Corporation, Government Electronic Systems. 


Usually the preconceived notion of a distributed sys- 
tem is one in which a specific task is being carried 
out by a small set of networked computers running 
the same operating system. That may have been true 
several years ago, but today a distributed system can 
be much more diverse. Distributed systems can span 
many computers and interoperate with virtually any 
operating system or hardware platform available on 
today's market. The magnitude of distributed sys- 
tems varies greatly depending on the intended pur- 
pose. A distributed system can contain a few in- 
teroperating computers, or may span the entire Inter- 
net. 


2. Overview 


A component or object based architecture can greatly 
benefit a distributed system. Construction of a sys- 
tem with objects vastly increases modularity; the 
interchangeable nature of individual system compo- 
nents translates into design and implementation 
flexibility for the applications engineer. Ideally, up- 
grades to the system can be done on a component 
basis, while the system remains up and running. 
Zero downtime is a very attractive feature of an ob- 
ject based distributed system, especially for mission 
critical applications. Providing uninterruptible sys- 
tem-wide service is a difficult and complex task. 
Many standards have been developed to aid in the 
composition of such systems. 


One of the distributed systems standards that has 
been devised since the emergence of the object para- 
digm is called the Common Object Request Broker 
Architecture (CORBA) [2]. CORBA is one compo- 
nent of the Object Management Architecture (OMA) 
and was developed by the Object Management Group 
(OMG) [1]. CORBA is a well defined robust stan- 
dard that is component (object) based, is supported 
on virtually all hardware platforms, and is fully in- 
teroperable and portable. 


On the surface, the CORBA standard for distributed 
objects appears to be another type of Remote Proce- 
dure Call (RPC) [3] implementation. On closer in- 
spection, CORBA proves to offer much more than 
just predefined procedure calls with static parame- 
ters. Execution of remote objects, parameter mar- 
shaling, multiplatform interoperation, interface port- 
ability, dynamic interface invocation, variable pa- 
rameters and platform independent data types are 
some of the features of CORBA that do not exist in 
standard RPC, OSF/DCE or message passing stan- 
dards, such as Message Passing Interface (MPI) [5] 
and the like. 


While many aspects of CORBA are attractive for 
large scale distributed system development, it does 
not inherently support more than rudimentary levels 
of fault tolerance. To implement basic fault tolerance 
in CORBA without involving external mechanisms, 
server applications can be cloned and distributed 
throughout the network to provide high availability 
of services to clients. Cloning merely creates redun- 
dant copies of the server application, so services are 
more readily available to client applications. Unfor- 
tunately this does not take into account the mutable 
state of the application at hand. Since no data syn- 
chronization takes place between clones, this presents 
a low level of fault tolerance and is unacceptable for 
many fault critical applications. 


During failures, certain CORBA Object Request 
Brokers (ORBs), such as Visigenic Software’s Visi- 
Broker [6], can automatically fail-over to an applica- 
tion that can provide a desired service. However 
there is no guarantee of appropriate object state. 
Fail-over occurs transparently to client applications, 
thus providing a layer of isolation between the client 
and the system's fault tolerant mechanisms. 


Replicating server applications is another technique 
similar to cloning that provides high availability of 
services to clients. Object replication is meant to not 
only provide availability of services, but also to 
maintain strict data consistency between objects [4]. 


3. IGOR Architecture 


In response to the need for a fault tolerant system and 
our desire to test CORBA with respect to its support 
for easily insertable object behavior extensions, we 
have developed the IGOR (Interactive-Group Object- 
Replication) system. Object replication was selected 
as the basis for implementation of IGOR's fault toler- 
ance mechanism. IGOR is a system of interacting 
objects that provides mechanisms for the creation of 
a reconfigurable fault tolerant system, with the addi- 
tional and strategically important constraint of lying 
entirely within the CORBA architecture. Layered on 
CORBA, IGOR yields a portable, interoperable and 
modular design that will remain portable over 
changes and improvements in the CORBA standard 
and changes of platform, operating system and com- 
munication technologies. 


Much of our attention has been directed towards de- 
velopment of a fault tolerant system which reduces 
the invasiveness of the underlying fault tolerant 
mechanisms with respect to implementation and op- 
eration. Our aspiration for IGOR was to provide a 
tool that would ease the development of fault tolerant 
distributed applications, while simultaneously em- 
bracing the introduction of legacy (CORBA and non- 
CORBA) applications into the fault tolerant arena. 


An object grouping scheme has been devised for the 
IGOR system to facilitate fault tolerance by redun- 
dancy (object replication). Distributed replica server 
applications enroll themselves into fault tolerant 
groups through an IGOR registration process. These 
groups are actually logical representations that asso- 
ciate like server applications that share a common 
data set. Each fault tolerant group functions sepa- 
rately as a single logical unit and group members 
interact with each other to maintain intragroup data 
consistency. Client applications benefit by the 
group's high availability of services and redundant 
data. 


Fault tolerant groups in IGOR are resilient to partial 
failures and provide these fault tolerant services 
transparently to client applications. In fact, the client 
object does not need to be aware that the server ap- 
plication which it is accessing is a member of a fault 
tolerant group. The client code is identical whether 
the client object is connected to an IGOR fault toler- 
ant object or a single non-fault tolerant object. Be- 
cause of this flexibility, fault tolerance may be added 
to the system even after client applications have al- 
ready been deployed. This is done by simply re- 
placing the non-fault tolerant server objects with their 
IGOR fault tolerant counterparts. No code modifica- 
tion or recompilation of the client application is re- 
quired for the addition of IGOR fault tolerance. 


A single IGOR Registry Service acts as the govern- 
ing body for fault tolerant group enrollment. The 
purpose of the registration process with the IGOR 
Registry is to ensure that fault tolerant groups consist 
of only objects of identical type. The IGOR Registry 
is responsible for recording the logical arrangement 
and association of all replica groups; of course this 
information is persistently stored concurrently in a 
redundant object database. To keep track of all the 
fault tolerant replica groups, the Registry constructs a 
binary tree consisting of all groups, called 
the Group Tree. A single Group Tree 
represents active fault tolerant groups for the entire 
IGOR system. The purpose of the Group Tree is 
only to facilitate organizing replica groups in a 
meaningful fashion to ease the Registry's task of 
group management. To arrange group members, 
each node on the Group Tree contains a balanced 
binary sub-tree of replicas for the group; this sub-tree 
is called the Object Tree. An Object Tree logically 
represents the group membership of a single fault 
tolerant group. 
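
The registration rule and the two-level arrangement can be sketched as below. This is only an illustration: the class and method names are hypothetical, a plain dictionary stands in for the Group Tree, and the Object Tree is stored as an array-based complete binary tree, which is one simple way to keep it balanced and not necessarily the layout IGOR uses.

```python
class ObjectTree:
    """Replicas of one group, stored as a complete (hence balanced) binary tree."""

    def __init__(self, type_name):
        self.type_name = type_name
        self.members = []  # index i has children 2*i + 1 and 2*i + 2

    def add(self, replica_ref):
        self.members.append(replica_ref)

    def children(self, index):
        return [i for i in (2 * index + 1, 2 * index + 2) if i < len(self.members)]


class Registry:
    def __init__(self):
        self.groups = {}  # group name -> ObjectTree (stands in for the Group Tree)

    def register(self, group, type_name, replica_ref):
        """Enroll a replica, refusing objects whose type differs from the group's."""
        tree = self.groups.setdefault(group, ObjectTree(type_name))
        if tree.type_name != type_name:
            raise TypeError(f"group {group!r} holds {tree.type_name}, not {type_name}")
        tree.add(replica_ref)
        return tree
```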


In the event of a Registry failure, a new Registry is 
launched and retrieves fault tolerant group informa- 
tion from the object database. To further protect the 
system, each member of a fault tolerant group caches 
a local copy of information regarding the current 
status of the group memberships. This decouples the 
fault tolerant groups from the Registry; therefore, the 
system can continue to operate even in the absence of 
the Registry. 


A set of IGOR objects harboring methods for fault 
tolerance is integrated into server applications; this 
alleviates the burden which would otherwise be 
placed on the server application to implement all 
fault tolerance mechanisms. These IGOR objects 
handle fault tolerant group membership related func- 
tionality, object monitoring and perform intragroup 
communication transparently from within the server 
application. IGOR objects have their own CORBA 
interfaces and converse amongst each other to per- 
form maintenance tasks, such as reconfiguration and 
message propagation. Intragroup propagation uses 
the branches of the Object Tree (binary tree) as 
communication paths to ensure a single and complete 
group propagation of all messages. The Object Tree 
evenly distributes the burden of messaging to all 
group members. The responsibility for transaction 
processing is shared between the IGOR objects and 
the server application. The Two-Phase Commit [3] 
protocol was used for the transaction processing as- 
sociated with object state transfer among replicas. 
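
Propagation along the branches of the Object Tree can be illustrated with a short recursive sketch. It assumes, for simplicity, that the message enters at the root and that the tree is laid out as a list in complete-binary-tree order; both are assumptions of this example, and the function name `propagate` is hypothetical.

```python
def propagate(members, message, deliver, index=0):
    """Forward `message` from the member at `index` down the Object Tree.

    `members` is a list laid out as a complete binary tree, so the children
    of position i sit at 2*i + 1 and 2*i + 2. Returns the number of hops used.
    """
    if index >= len(members):
        return 0
    deliver(members[index], message)
    hops = 0
    for child in (2 * index + 1, 2 * index + 2):
        if child < len(members):
            # One hop to reach the child, plus whatever its subtree needs.
            hops += 1 + propagate(members, message, deliver, child)
    return hops
```

Each member is reached exactly once and forwards to at most two others, so no single replica carries the whole messaging load.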


A system should not render itself inoperable because 
of partial network failures or a few downed com- 
puters. Fortunately, IGOR's redundant component 
design protects against such problems by automati- 
cally reconfiguring itself when failures occur. This is 
possible since the IGOR system is self monitoring 
and can quickly detect problematic objects and adjust 
accordingly. 


4. Implementation Experience 


Originally, the intention was to have the server appli- 
cation inherit fault tolerance mechanisms through a 
standard IGOR Object (C++ base class) with a 
CORBA interface. The inheritance approach proved 
to be infeasible due to some limiting constraints of 
the CORBA standard. Inheriting the IGOR Object 
class would entail inheriting both its IDL interface 
and the code associated with its fault tolerant mecha- 
nisms into a server application (which has its own 
IDL interface and associated code). In its current 
state, CORBA does not support multiple implemen- 
tation interface inheritance within a single object. 
CORBA does support multiple interface inheritance 
within IDL, however, applications cannot inherit 
from multiple implementations of interfaces. 


Consequently, we decided to incorporate the IGOR 
Object within the server application as a class mem- 
ber. Although not the original intent, inclusion of the 
IGOR Object as a class member yields a tightly cou- 
pled link between the server application and the 
IGOR Object. As a result, management of fault tol- 
erant operations is largely performed by the IGOR 
Object on behalf of the server application, despite the 
inability to directly inherit such functionality. Cou- 
pling of the IGOR and server objects fuses the two 
objects together, with interaction between them medi- 
ated by standard C++ method invocations. On the 
other hand, each object uses its own CORBA inter- 
face for remote communications. 


As much fault tolerant code as possible has been off- 
loaded to the IGOR Object. Unfortunately all the 
fault tolerant code cannot be handled by the IGOR 
Object alone. Certain aspects of server specific in- 
formation (server's mutable state) must be handled by 
the server application in conjunction with the IGOR 
Object's involvement. The server's object state type 
cannot be known to the IGOR Object. Although the 
IGOR Object plays an important role in maintaining 
the data synchrony of the fault tolerant group, it does 
it somewhat blindly with respect to the actual data 
that is being transmitted. The IGOR Object knows 
nothing of the server’s data, but it takes control of 
moving the data by instructing the server with respect 
to where to send or get state information. Again, 
intragroup state exchanges follow the branches of the 
Object Tree for communication. 


Installing IGOR fault tolerance into a server applica- 
tion requires inheritance of a transaction processing 
class and creation of additional proxy methods to aid 
the IGOR Object in intragroup data transfer and 
transaction processing. Such proxy methods would 
not be required if direct object implementation in- 
heritance were possible. Also, an object state class 
called StateKeeper is inherited, which contains meth- 
ods for mutex locking and unlocking to protect the 
object’s state from multiple thread access. 
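
The resulting structure (an IGOR Object held as a class member, proxy methods on the server, and an inherited state class with a mutex) can be sketched as follows. The sketch is written in Python for brevity although IGOR itself is implemented in C++; apart from `StateKeeper`, every name here (`IgorObject`, `CounterServer`, `get_state`, `set_state`, `pull_state`, `push_state`) is hypothetical.

```python
import threading


class StateKeeper:
    """Base class guarding the server's mutable state with a mutex."""

    def __init__(self):
        self._lock = threading.Lock()

    def lock(self):
        self._lock.acquire()

    def unlock(self):
        self._lock.release()


class IgorObject:
    """Moves opaque state around without interpreting it."""

    def __init__(self, server):
        self._server = server

    def pull_state(self):
        return self._server.get_state()  # proxy method on the server

    def push_state(self, blob):
        self._server.set_state(blob)     # proxy method on the server


class CounterServer(StateKeeper):
    def __init__(self):
        super().__init__()
        self.count = 0
        self.igor = IgorObject(self)  # class member, not a base class

    def get_state(self):
        self.lock()
        try:
            return {"count": self.count}
        finally:
            self.unlock()

    def set_state(self, blob):
        self.lock()
        try:
            self.count = blob["count"]
        finally:
            self.unlock()
```

The IGOR side never inspects the contents of the state it transfers; only the server's proxy methods know what the state contains.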


5. Conclusion 


Conception of IGOR was a result of the need for fault 
tolerance in a standard operating environment. 
CORBA provides the means to realize fault tolerant 
distributed systems; IGOR capitalizes on the power 
and flexibility of CORBA to provide tools to aid in 
the creation of a fault tolerant system. 


We were successful in completely isolating the client 
applications from all fault tolerance mechanisms by 
embedding fault tolerance functionality into the 
server applications, using pre-defined IGOR objects 
to perform a majority of the work. 


Hopefully, CORBA's evolution will encompass the 
capabilities that make the design of fault tolerant 
systems less complex and less costly in performance. 


Abstract 


The Eternal system is a CORBA 2.0-compliant system that 
enhances the CORBA standard with replication and thus 
fault tolerance. The novel interception approach imple- 
mented in the Eternal system involves capturing IIOP- 
specific system calls made by the ORB, and subsequently 
mapping these calls onto a reliable multicast group commu- 
nication system. The motivation for the use of this approach 
is that fault tolerance is transparent to the application ob- 
jects, as well as to the ORB, and that any commercial ORB 
can be used with no internal modification. The intercep- 
tion approach exploits the performance of the underlying 
multicast group communication system to provide good 
performance. 


1 Introduction 


The incorporation of the object-oriented paradigm into the 
distributed computing model has resulted in the develop- 
ment of distributed object applications. Such applications 
must be portable, and the objects of the application must 
be able to interoperate when distributed across heteroge- 
neous platforms with diverse hardware and software. The 
need for a standard that provides these features has led to 
the development of the Common Object Request Broker 
Architecture (CORBA). 


While the CORBA standard provides for interoperability, 
language transparency, location transparency and portabil- 
ity, it does not address the issue of fault tolerance. Since 
there is an increasing need for reliable distributed object 
applications, current research is focusing on adding fault 
tolerance to CORBA. 


*Research supported in part by DARPA grant N00174-95-K-0083 and by 
Sun Microsystems and Rockwell International Science Center through the 
State of California MICRO Program grants 96-051 and 96-052. 


2 CORBA and IIOP 


CORBA is a standard for communications middleware that 
defines interfaces to distributed objects and that provides 
mechanisms for communicating operations to objects by 
means of messages. The central idea of CORBA is the 
Object Request Broker (ORB), which mediates communi- 
cation between client and server objects. All of the requests 
to, and responses from, the distributed objects are passed 
through the ORB. 

To facilitate the interworking of commercial ORBs de- 
veloped by different vendors, the CORBA 2.0 standard 
defines the Internet Inter-ORB Protocol (IIOP). IIOP allows 
objects operating over heterogeneous IIOP-compliant ORBs 
to interact with each other, irrespective of the internal struc- 
ture of the ORBs or of any vendor-specific mechanisms. 
IIOP has a simple and generic interface that is designed 
to facilitate communication between heterogeneous ORBs. 
The IIOP-specific system calls invoked by the ORB are 
intended for the underlying TCP/IP layer. 


3 The Eternal System 


The Eternal system is a CORBA 2.0-compliant system 
that enhances the CORBA standard with fault-tolerance 
capabilities. Eternal exploits the facilities of an underlying 
multicast group communication system, in our case Totem, 
to provide CORBA-based applications with fault tolerance. 
In addition to providing reliable totally ordered multicasting 
of messages of the ORB, Totem provides mechanisms to 
deal with membership changes that occur when processors 
or processes fail, or the network partitions. 

The Eternal system interfaces with the process group 
layer of Totem. The process group layer provides a simple 
set of group communication primitives and hides the imple- 
mentation details of the underlying Totem protocols. Any 
multicast group communication system with an interface, 
membership services and guarantees similar to Totem, can 
alternatively be used. 


4 Approaches to Fault Tolerance 


Initial efforts to enhance CORBA with fault tolerance have 
taken an integration approach, with the reliability mecha- 
nisms incorporated into the ORB itself. With the advent 
of Object Services in the CORBA standard, other research 
efforts have taken a service approach, with the provision of 
a reliable object group service as part of the Object Services. 
To achieve the best of both of these previous approaches, 
we have adopted a novel “interception” approach. 

These three different approaches are discussed briefly 
below and are illustrated in Figure 1. In all of these ap- 
proaches, replication is employed to provide fault tolerance. 
The replicas of an object are considered to be members of an 
object group, where all of the replicas in the group have the 
same state. Requests can be conveyed to all of the replicas 
of an object by addressing the object group as a whole. 


4.1 The Integration Approach 


The integration approach [4], as implemented in the Electra 
ORB, as well as in Orbix+Isis, involves layering the ORB 
over a reliable ordered multicast group communication sys- 
tem. To enable the ORB to communicate its messages over 
the underlying system, adaptor objects are interpositioned 
between the reliable multicast system and the ORB. The 
mechanisms for the replication of objects and for the con- 
sistency of the replicas are embedded within the ORB, thus 
requiring internal modification of the ORB. The advantage 
of this approach is that it ensures transparency of the fault 
tolerance to the application objects since all of the neces- 
sary mechanisms are incorporated into the ORB itself. The 
application objects simply use the ORB as a communication 
path for their requests and responses. 


4.2 The Service Approach 


The service approach, as implemented in the Open- 
DREAMS project [2], involves providing an Object Group 
Service as part of the suite of Object Services that are 
defined by CORBA. The application objects convey their 
invocations and responses, via the Dynamic Invocation In- 
terface (DII) and the Dynamic Skeleton Interface (DSI), to 
their associated OGS objects, which then coordinate with 
each other to perform the operation on the replicas of the 
object and to return the results appropriately. The advantage 
of this approach is that it is wholly compliant with the 
CORBA standard and requires no proprietary mechanisms. 
However, the fault tolerance is now visible to the applica- 
tion objects since the application objects must be aware of 
the existence of the OGS objects in order to utilize their 
services. 

Figure 1: Different approaches to reliable CORBA. 


4.3 The Interception Approach 


The interception approach, as implemented in the Eternal 
system [7], involves capturing the system calls of the objects 
hosted by the ORB. The intercepted calls, which were 
originally directed by the ORB to TCP/IP, are now mapped 
onto a reliable ordered multicast group communication 
system. The advantages of this approach are that neither 
the ORB nor the objects need ever be aware of being 
“intercepted” and, thus, the fault tolerance is not visible to 
the application objects. Furthermore, the internal structure 
of the ORB requires no modification since the mechanisms 
that provide reliability are external to the ORB. 


5 The Interception Approach in Eternal 


5.1 “Catching” IIOP System Calls 


Every CORBA object, on its creation, is associated with 
a unique Unix process identifier pid. Using user-level 
extensions [1] to the operating system, the system calls of 
the object can be traced using the file /proc/pid, which is 
a part of the /proc interface in Unix. The system calls of 
these objects can be monitored and captured. In addition, the 
arguments of these intercepted system calls can be modified 
before the calls are allowed to proceed to the operating 
system. 

The Eternal Interceptor “catches” the calls made by the 
ORB, via IIOP, to TCP/IP. These system calls are then 
mapped onto the routines of the process group interface 
of the Totem system, which assumes the responsibility for 
multicasting messages. Of interest to us are only those 
system calls that are invoked by the ORB for establishing 
connections between objects and for maintaining the inter- 
action of objects on these connections. Thus, the Eternal 
Interceptor catches system calls such as open(), close(), 
read(), write() and poll(), which involve TCP/IP connec- 
tions and file descriptors. This specified set of system calls 
finds its correspondence in the set of routines of the process 
group interface of the underlying Totem system. 
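
The correspondence between intercepted calls and group operations can be pictured as a dispatch table. The sketch below models only that mapping: it does not trace processes through /proc, and the `ProcessGroup` class is an in-memory stand-in whose routine names (`join`, `leave`, `multicast`) are assumptions of this example rather than the actual Totem interface.

```python
class ProcessGroup:
    """In-memory stand-in for a process-group layer (not Totem's real API)."""

    def __init__(self):
        self.members = set()
        self.log = []  # one shared sequence models totally ordered multicast

    def join(self, member):
        self.members.add(member)

    def leave(self, member):
        self.members.discard(member)

    def multicast(self, sender, data):
        self.log.append((sender, data))


class Interceptor:
    """Redirects the connection-related calls of one object to a group."""

    def __init__(self, group, member):
        self._group = group
        self._member = member
        self._cursor = 0  # position of the next unread message in the log
        self._table = {
            "open": self._open,
            "close": self._close,
            "write": self._write,
            "read": self._read,
            "poll": self._poll,
        }

    def syscall(self, name, *args):
        handler = self._table.get(name)
        if handler is None:
            return ("passthrough", name)  # not intercepted, left to the OS
        return handler(*args)

    def _open(self):
        self._group.join(self._member)

    def _close(self):
        self._group.leave(self._member)

    def _write(self, data):
        self._group.multicast(self._member, data)
        return len(data)

    def _poll(self):
        return self._cursor < len(self._group.log)

    def _read(self):
        if not self._poll():
            return None
        message = self._group.log[self._cursor]
        self._cursor += 1
        return message
```

Calls outside the table pass through untouched, which mirrors the point that only connection-related system calls are of interest to the interceptor.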


5.2 Replication of Objects 


In the Eternal system, both client and server objects can 
be replicated. The object group abstraction of a replicated 
object enables any client object in the system to address 
the replicas of a server object as a whole, using a unique 
object group identifier. The translation of the object group 
identifier into the individual object references of the object 
group members is done transparently by Eternal. 

The objects of Eternal in the CORBA space are in one-to- 
one correspondence with processes in the Totem framework. 
Eternal maintains the mapping between object groups and 
process groups, and extends the process group membership 
services of Totem to object groups. 

Most importantly, Eternal ensures that the states of the 
replicas of an object remain consistent. The reliable totally 
ordered multicasts of Totem guarantee that the replicas of 
an object “see” the same operations in the same order. 
However, in a system where replication is employed, it is 
possible for duplicate invocations and duplicate responses 
of objects to occur. These can potentially corrupt the state 
of an object. Eternal provides mechanisms to detect and 
suppress such duplicate operations. 
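
The text does not describe how duplicates are recognized, so the following is only one plausible sketch: each logical operation carries an identifier shared by all replicas of the invoking object, and the receiver applies the operation the first time that identifier is seen. The names `DuplicateFilter`, `Account`, `sender_group` and `operation_id` are hypothetical.

```python
class DuplicateFilter:
    """Lets each logical operation through once, however many replicas issue it."""

    def __init__(self):
        self._seen = set()

    def admit(self, sender_group, operation_id):
        key = (sender_group, operation_id)
        if key in self._seen:
            return False  # another replica already issued this operation
        self._seen.add(key)
        return True


class Account:
    def __init__(self):
        self.balance = 0
        self._filter = DuplicateFilter()

    def deposit(self, sender_group, operation_id, amount):
        # Apply the deposit only for the first copy of the invocation.
        if self._filter.admit(sender_group, operation_id):
            self.balance += amount
        return self.balance
```

Without such a filter, a deposit issued by a three-way replicated client would be applied three times, which is the kind of state corruption the paragraph above warns about.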

Eternal also manages the creation of new replicas and 
the removal of existing ones. It also undertakes the place- 
ment and distribution of replicas and handles the degree of 
replication of objects. 


6 Benefits of the Interception Approach 


6.1 Replication Transparency 


Using the interception approach, Eternal captures the calls 
of an object and transparently maps these calls onto an 
object group. Thus, a client object is only ever aware of 
addressing a single server object while, in fact, the request 
is communicated to each of the server replicas. Similarly, a 
server replica is only ever aware of returning its results to a 
single object while, in fact, the results are returned to all of 
the client replicas. 

Replication transparency allows the application devel- 
oper to write an object-oriented program for the application 
as if it were to run on a single machine, rather than across 
a distributed system. Eternal assumes the responsibility of 
replicating and locating the application objects, and main- 
taining the consistency of the replicas of the objects across 
the distributed system. 


6.2 Use with Commercial Off-the-Shelf ORBs 


The interception approach allows Eternal to “attach” itself
transparently to any commercial off-the-shelf implementation
of the CORBA 2.0 standard. Thus, the ORB itself
is never aware that its calls are being traced, or that
Eternal is interposed between the ORB and the operating
system.

This implies that any application operating on any com- 
mercial ORB could take advantage of the replication and 
fault tolerance capabilities of Eternal without any modifi- 
cation to the application code or to the internal structure of 
the ORB. 


6.3 Use of the IIOP Interface


The Internet Inter-ORB Protocol (IIOP) is supported by 
complete implementations of the CORBA 2.0 standard. It 
has a simple and generic interface, which is designed to 
facilitate communication between heterogeneous ORBs. 

At the client, Eternal captures the calls of the IIOP interface
and maps them to Totem; at the server, it receives the Totem
multicast messages and maps them back to IIOP. Thus, it is
possible for the client and the server objects to be hosted
on entirely different ORBs, provided that these ORBs are
equipped with IIOP. Moreover, the replicas that constitute an
object group could, in fact, be objects implemented in
different languages and running over different ORBs. The
only stipulation is that these objects be able to communicate 
over IIOP. Fortunately, an increasing number of vendors 
now supply IIOP as their native protocol. 
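
One way to picture the mapping between IIOP and Totem is as an encapsulation: the captured IIOP message is carried unchanged inside a multicast frame addressed to an object group, and is unwrapped again at the server. The framing below (a two-byte length, the group identifier, then the IIOP bytes) is an assumption made for this sketch only; the actual message format used by Eternal is not described here.

```python
import struct


def wrap_for_multicast(group_id: str, iiop_message: bytes) -> bytes:
    """Client side: prefix the captured IIOP bytes with the group id."""
    gid = group_id.encode("utf-8")
    # Hypothetical frame: 2-byte big-endian length, group id, payload.
    return struct.pack("!H", len(gid)) + gid + iiop_message


def unwrap_from_multicast(frame: bytes):
    """Server side: recover the group id and the original IIOP bytes."""
    (gid_len,) = struct.unpack_from("!H", frame, 0)
    group_id = frame[2:2 + gid_len].decode("utf-8")
    iiop_message = frame[2 + gid_len:]
    return group_id, iiop_message
```

Since the IIOP bytes pass through untouched, the ORBs at either end need only agree on IIOP, not on anything specific to the transport in between.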


6.4 Performance 


The Eternal system is currently under development, using 
various implementations of the CORBA 2.0 standard that are 
commercially available, including the CORBA-compliant 
Inter-Language Unification (ILU) [3] from the Xerox Palo 
Alto Research Center. 

A typical application using a single server object and a
single client object over ILU without Eternal achieves 910
object invocations per second. With three-way replication
of the same client and server objects over ILU with Eternal,
preliminary measurements yield 670 object invocations
per second. These measurements indicate that the
overhead associated with interception and multicasting is 
not unreasonable for replicating objects, particularly since 
the code of the ORB and the operating system are unmod- 
ified. With further optimization of the code of Eternal, we 
anticipate even better performance. 
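
To put the two figures in proportion, the short calculation below derives the fraction of the unreplicated throughput that is retained and the extra time per invocation. The per-invocation figure assumes that invocations are issued one after another, which the measurements above do not state; the function name is illustrative.

```python
def throughput_overhead(baseline_per_s, replicated_per_s):
    """Compare replicated throughput against the unreplicated baseline."""
    retained = replicated_per_s / baseline_per_s
    # Assumes strictly sequential invocations, so time = 1 / throughput.
    added_ms = 1000.0 / replicated_per_s - 1000.0 / baseline_per_s
    return retained, added_ms


retained, added_ms = throughput_overhead(910, 670)
print(f"{retained:.1%} of baseline throughput, "
      f"+{added_ms:.2f} ms per invocation")
# prints "73.6% of baseline throughput, +0.39 ms per invocation"
```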

Since replication of objects requires interaction between 
groups of replicas, some sort of underlying multicast group 
communication is required. The use of a reliable totally 
ordered multicast group communication system simplifies 
the ordering of operations at the replicas, and yields better 
performance than multiple point-to-point TCP/IP connec- 
tions between each pair of interacting replicas. The high 
performance of the underlying Totem multicast group com- 
munication system is exploited by Eternal to obtain good 
performance. 
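
A simple counting model suggests why multicast helps. It assumes that, with point-to-point connections, every client replica must send its request separately to every server replica, whereas a single multicast from each client replica reaches the whole server group. The model counts send operations only and ignores acknowledgments, retransmissions and ordering traffic.

```python
def point_to_point_sends(client_replicas, server_replicas):
    """Each client replica sends once to each server replica."""
    return client_replicas * server_replicas


def multicast_sends(client_replicas):
    """Each client replica multicasts once to the server group."""
    return client_replicas


# Three-way replication of both client and server, as measured above.
print(point_to_point_sends(3, 3), multicast_sends(3))  # prints "9 3"
```

With point-to-point connections, the replicas would additionally have to agree on the order of operations by some other means, which the totally ordered multicast provides directly.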


7 Conclusion 


Eternal enhances the CORBA standard by allowing appli- 
cation objects to be replicated and distributed on different 
machines across the system, while maintaining consistency 
of the replicas of the objects. Replication of objects across
the distributed system provides tolerance to a variety of 
hardware and software faults. It also allows hardware and 
software components to be replaced while the system is 
live so that the application can continue to operate without 
interruption of service. 